1.0  Executive  Summary


This  report  presents  an  approach  to  a methodology  for  the  statistical 
analysis  of  ejection  performance  data  from  the  U.S.  Navy's  Aircrew  Automated 
Escape  Systems  (AAES).  A central  problem  area  in  any  analysis  of  AAES  ejec- 
tion data  is  a comprehensive  examination  of  injuries  associated  with  the  ejec- 
tions. Accordingly,  the  statistical  analysis  methodology  developed  herein 
is  oriented  primarily  toward  an  in-depth  critical  examination  of  aircrew  in- 
jury data.  However,  the  methodology  can  be  applied  with  equal  facility  to 
any  other  AAES  data  which  is  amenable  to  statistical  analysis. 

Basically,  this  report  initially  surveys  selected  analyses  already  applied 
to  AAES  data.  Results  are  discussed.  Although  many  reports  have  been  developed 
in  the  area  of  AAES  studies,  the  particular  document  selected  for  examination 
here  is  the  report  dated  8 November  1974,  "An  Evaluation  of  the  Effects  of 
Spreader  Gun  Usage  in  Navy  Ejection  Seat  Personnel  Parachute  Subsystems," 
by  Messrs.  Robert  L.  Wallace,  James  W.  Pope  and  Frederick  C.  Guill  of  NAVAIR 
53121. Appendix I in that report, entitled "Statistical Evaluation of ESCAPAC
Series Ejection Seat Components' Correlation with Incidence of Neck Injuries,"
comprises four subsections. Specific attention here will be devoted
to  subsection  3,  entitled,  "Correlation  Analysis  of  Potential  Causal  Factors 
for  Increase  in  Neck  Injuries  Sustained  During  Ejections  Using  HS-1A  (RA-5C) 
Ejection  Seats,"  and  subsection  4,  entitled,  "Contingency  Analysis  of  RA-5C 
Escape  Data." 

Next,  an  approach  to  a reasonable  statistical  analysis  methodology  is 
proposed.  This  consists  of  (1)  data  analysis  (re-format,  if  necessary),  (2) 
discrete  probability  density  functions,  (3)  application  of  Bayesian  statisti- 
cal theory,  (4)  higher  order  contingency  table  analysis,  (5)  non-parametric 
statistical  tests,  (6)  continuous  probability  density  functions  (normal,  gamma 
and  beta),  (7)  small  sample  statistical  analysis,  (8)  sampling  probability 
density  functions  (Chi-square,  t,  and  F),  and  (9)  application  of  confidence 
limits.  A bibliography  pertinent  to  statistical  ideas  is  included.  A few 
references  to  selected  U.S.  Navy  documents  studied  during  the  course  of  this 
project  also  are  listed. 


The  lack  of  systematic  processes  for  collection  and  evaluation  of  AAES 
operating  statistics  has  hampered  NAVAIR  in  its  response  to  developing  AAES 
problems.  As  an  example,  a study  recently  completed  on  the  spreader  gun 
used  in  an  AAES  (see  the  report  mentioned  above)  was  initiated  after  the  dis- 
covery by  the  Naval  Aviation  Safety  Center  of  a significant  difference  between 
the  number  of  neck  injuries  occurring  with  various  types  of  ESCAPAC  ejection 
seats  in  the  A-4  aircraft  and  the  HS-1/HS-1A  ejection  seats  in  the  A-5  air- 
craft. Initially,  the  difference  in  number  of  neck  injuries  was  attributed 
by  the  Naval  Aviation  Safety  Center  to  the  spreader  gun.  Subsequent  in-depth 
analyses,  however,  revealed  that  neck  injuries,  among  A-4  users,  were  highly 
correlated  with  the  presence,  or  absence,  of  a ballistic  powered  haulback 
type  inertia  reel.  In  A-5  aircraft,  neck  injuries  were  thought  to  be  corre- 
lated with  the  malfunction  of  another  system  element. 

The  neck  injury  analysis,  and  similar  past  studies,  indicated  the  need 
for  a comprehensive  and  independent  study  of  all  ejection  systems  to  evaluate 
their  quality,  effectiveness,  reliability  and  maintainability.  Moreover, 
analysis  of  past  testing,  in  view  of  current  in-service  experience  in  AAES 
quality,  would  help  determine  what  relationships  exist  between  test  results 
and  actual  in-service  experience. 


A one-time study of existing systems is not sufficient to solve NAVAIR's
problem in this area. A long-term solution must include the collection and
processing by NAVAIR of AAES performance data on a continuing basis.
Complementing data collection is the need for development of an analysis
methodology which will assist NAVAIR with its mission in keeping with the
following:

• Allocate its AAES resources efficiently

• Provide  more  AAES  responsiveness  to  the  Fleet 

• Resolve  AAES  problems  in  a manner  that  enhances  in-service  system 
reliability  and  performance. 

To assist NAVAIR with selected AAES problem areas, this study is constructed
in four distinct phases: (1) Phase I, establishment of an analysis
methodology; (2) Phase II, application of the analysis methodology to a single
AAES; (3) Phase III, application of the analysis methodology to a set of
AAESs; and (4) Phase IV (based on results and recommendations in Phases I,
II and III), institution of a scheme which will monitor AAES production/test
activities on a continuing basis. Proposed future efforts under Phases II, III
and IV of the AAES study conclude this report.


2.0  Introduction 


Many reports and other documents have been developed by various organizations
to study problems in the area of Aircrew Automated Escape Systems. One recent
report (8 November 1974), "An Evaluation of the Effects of Spreader Gun Usage
in Navy Ejection Seat Personnel Parachute Subsystems," by R.L. Wallace,
J.W. Pope and F.C. Guill, confirmed that statistical analyses, such as
correlation analysis and contingency table analysis, can be used effectively
to analyze problem areas in AAES. Central to the investigative efforts in the
above report was an analysis of injury data, specifically, neck injuries.

This  report  surveys  two  statistical  methods  used  in  the  report  by  Wallace, 
et  al:  (1)  correlation  analysis,  and  (2)  contingency  table  data  analysis. 

After  that  survey,  a reasonable  approach  to  a statistical  analysis  methodology 
is  proposed.  It  includes  the  following: 

• Data Analysis — Here, the data are to be given a multiple classification
scheme in consonance with MORs such as 1 - no injury/minimal injury,
2 - minor injury, 3 - major injury, 4 - fatality, and 5 - other, to include
lost and unknowns.

• Discrete Probability Density Functions — Three discrete density functions
were chosen: (1) binomial, useful in analyzing dichotomous situations,
(2) Poisson, useful in attacking special binomial problems, as well as
analyzing reliability problems, and (3) multinomial, useful in analyzing
events having two or more mutually exclusive outcomes.

• Bayesian  Statistical  Theory — This  is  useful  in  refining  a priori 
probability  estimates. 

• Higher  Order  Contingency  Table  Data  Analysis — This  is  useful  in 
testing  the  hypothesis  that  two  characteristics  are  independent. 

• Fisher's  Exact  Test — This  test  is  a method  for  obtaining  exact  prob- 
abilities of  the  occurrence  of  events  among  entries  in  a contingency 
table. 

• Non-Parametric  (Distribution  Free)  Statistical  Tests — These  tests 
are  designed  to  tell  whether  phenomena  represented  by  the  data  occur 
randomly  or  have  an  underlying  deterministic  trend. 
The main body of the report is concluded by a bibliography containing
references  in  the  AAES  area  as  well  as  in  the  field  of  statistical  analysis. 

There  are  two  appendixes  attached  to  this  report:  Appendix  A contains 
additional  statistical  analyses,  such  as  a survey  of  selected  continuous  prob- 
ability density  functions,  and  selected  sampling  probability  density  func- 
tions, such  as  the  Chi-square,  t and  F density  functions.  A survey  of  con- 
fidence limits,  with  examples,  is  given.  This  enables  one  to  attach  a given 
confidence  to  a probabilistic  statement  that  is  made. 

Appendix B contains a survey of proposed future efforts in Phases II, III,
and IV of this project. Specifically, Phase II will be application of
the  statistical  analysis  methodology,  derived  in  Phase  I,  to  actual  opera- 
tional data  for  a specific  AAES.  A single  attribute  data  stream,  as  well 
as  dual  attribute  data  streams,  will  be  studied.  An  analysis  of  the  results 
obtained  will  be  given. 

Phase III will consist of applying the statistical analysis methodology
derived in Phase I, as refined by Phase II, to a class of ejection seats.
Conclusions and recommendations would be a natural end product of the Phase
III analysis.

Activities in Phase IV, that is, development of an AAES monitoring system,
are sketched. Details of procedure in Phase IV must, of necessity, depend
upon output from Phases II and III.

Section 3.0


A SURVEY OF SELECTED STATISTICAL
ANALYSES APPLIED TO DATE

Basic material for the discussion that follows is found in subsections
3 and 4 of Appendix I in reference 13, specifically, [13, pp. 235-279]. In
these pages it is evident that two primary statistical analyses were used:
(1) correlation analysis, and (2) contingency table data analysis. Each of
these will be discussed in turn.


3.1  Correlation  Analysis


Injury data analysis in Appendix I of the referenced report has been
restricted largely to neck injuries. In those analyses, a binary designation
is employed: 1 indicates neck injury, 0 indicates no injury. No parent
probability density function has been associated with the given binary
designation. To illustrate the correlation analyses employed, consider Tables
3-1, 3-2, 3-3 and 3-4 found in [13, pp. 273-276]. In Table 3-1, several
observations can be made: (1) the relative frequency of a neck injury
[5, p. 36; 11, p. 99], sometimes defined as the probability of occurrence of
a neck injury, for the sample displayed in the table is $P_y = 3/17$; (2) the
probability of no neck injury is $1 - P_y = 14/17$; (3) all neck injuries in
the sample examined occurred when the HS-1A (upgraded) ejection seat was
used; (4) the relative frequency of ejection with an HS-1A (upgraded)
ejection seat is $P_x = 8/17$; and (5) the relative frequency of ejection
with an HS-1 seat is $1 - P_x = 9/17$.

Here it must be emphasized that the sample size (n = 17) is very small, hence
little confidence can be placed in the above probability statements.
Additional data, hopefully, will strengthen the confidence one can invest in
the validity of the preceding probability statements.
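
The relative frequencies quoted above can be checked directly against the 17
observations. The following Python sketch is illustrative only: the list names
are assumptions introduced here, and the values are transcribed from Table 3-1
(y = neck injury, x = seat designation, s = aircraft speed).

```python
from fractions import Fraction

# Observations transcribed from Table 3-1 (one entry per ejection).
y = [1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # 1 = neck injury
x = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # 1 = HS-1A (upgraded)
s = [260, 260, 400, 400, 300, 300, 400, 400,
     200, 200, 220, 220, 450, 200, 230, 230, 230]         # aircraft speed

n = len(y)
p_injury = Fraction(sum(y), n)   # relative frequency of a neck injury
p_hs1a = Fraction(sum(x), n)     # relative frequency of an HS-1A ejection
mean_speed = sum(s) / n          # sample mean of aircraft speed

# Check observation (3): every neck injury occurred with the HS-1A seat.
all_injuries_on_hs1a = all(xi == 1 for xi, yi in zip(x, y) if yi == 1)

print(n, p_injury, p_hs1a, round(mean_speed, 1), all_injuries_on_hs1a)
```

Exact fractions are used so that 3/17 and 8/17 appear without rounding; the
small sample size caveat stated above applies equally to these figures.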


Relative frequency was the measure used to compute mean values for:
(1) neck injury, (2) seat designation, and (3) speed. This enabled the
investigator to construct Table 3-2 containing numerical values for
$y_i - \bar{y}$,


TABLE 3-1*

HIGH SPEED EJECTIONS FROM THE RA-5C AIRCRAFT
(Observed Data)

   y = Neck Injury    x = Seat Designation    s = Aircraft Speed
   ---------------    --------------------    ------------------
          1                    1                     260
          0                    1                     260
          1                    1                     400
          0                    1                     400
          0                    1                     300
          0                    1                     300
          1                    1                     400
          0                    1                     400
          0                    0                     200
          0                    0                     200
          0                    0                     220
          0                    0                     220
          0                    0                     450
          0                    0                     200
          0                    0                     230
          0                    0                     230
          0                    0                     230
   ---------------    --------------------    ------------------
   Σ y_i = 3          Σ x_i = 8               Σ s_i = 4,900

n = 17    $\bar{x}$ = 0.471    $\bar{y}$ = 0.1765    $\bar{s}$ = 288.2

Ejectees sustaining a neck injury of any type are denoted by a "1"; and
all others by a "0".

Ejectees using the HS-1A (upgraded) ejection seat are identified by a "1";
those using the HS-1 (unmodified) by a "0".

*This table is shown as Table 1-14, on page 273 in Reference 13. (See
Section 6 for a complete list of references.)
TABLE 3-2*

HIGH SPEED EJECTIONS FROM THE RA-5C AIRCRAFT
(Data Modified by Mean Values)

   $y_i - \bar{y}$    $x_i - \bar{x}$    $s_i - \bar{s}$
   ---------------    ---------------    ---------------
       +0.824              0.529              -28.2
       -0.177              0.529              -28.2
       +0.824              0.529              111.8
       -0.177              0.529              111.8
       -0.177              0.529               11.8
       -0.177              0.529               11.8
       +0.824              0.529              111.8
       -0.177              0.529              111.8
       -0.177             -0.471              -88.2
       -0.177             -0.471              -88.2
       -0.177             -0.471              -68.2
       -0.177             -0.471              -68.2
       -0.177             -0.471              161.8
       -0.177             -0.471              -88.2
       -0.177             -0.471              -58.2
       -0.177             -0.471              -58.2
       -0.177             -0.471              -58.2

*This table is shown as Table 1-15, on page 274 in Reference 13.
(See Section 6 for a complete list of references.)
TABLE 3-3*

HIGH SPEED EJECTIONS FROM THE RA-5C AIRCRAFT
(Elements in the Q-Matrix)

    q_yy      q_xx        q_ss       q_yx      q_ys      q_xs
   ------    ------    ---------    ------    ------    ------
   0.678     0.280         795      +0.436     -23.1     -14.8
   0.031     0.280         795      -0.094      +5.0     -14.8
   0.678     0.280      12,544      +0.436     +92.3     +59.2
   0.031     0.280      12,544      -0.094     -19.8     +59.2
   0.031     0.280         144      -0.094      -2.1      +6.3
   0.031     0.280         144      -0.094      -2.1      +6.3
   0.678     0.280      12,544      +0.436     +92.3     +59.2
   0.031     0.280      12,544      -0.094     -19.8     +59.2
   0.031     0.222       7,744       0.083     +15.6     +41.4
   0.031     0.222       7,744       0.083     +15.6     +41.4
   0.031     0.222       4,624       0.083     +12.0     +32.0
   0.031     0.222       4,624       0.083     +12.0     +32.0
   0.031     0.222      26,244       0.083     -28.7     -76.3
   0.031     0.222       7,744       0.083     +15.6     +41.2
   0.031     0.222       3,364       0.083     +10.3     +27.3
   0.031     0.222       3,364       0.083     +10.3     +27.3
   0.031     0.222       3,364       0.083     +10.3     +27.3
   ------    ------    ---------    ------    ------    ------
 Σ 2.469     4.238     120,870       2.525     183.7     413.4

*This table is shown as Table 1-16 on page 275 in Reference 13. (See Section
6 for a complete list of references.)
TABLE 3-4*

HIGH SPEED EJECTIONS FROM THE RA-5C AIRCRAFT
(The Q-Matrix and Partial Correlation Coefficients)

$$
Q = \begin{array}{c} y \\ x \\ s \end{array}
\begin{pmatrix}
2.469 & 2.525 & 183.7 \\
2.525 & 4.238 & 413.4 \\
183.7 & 413.4 & 120{,}870
\end{pmatrix}
$$

$$ r_{crit95} = 0.497, \qquad r_{crit99} = 0.623 $$

$$ r_{yx} = \frac{2.525}{\sqrt{(2.469)(4.238)}} = 0.7806 $$

$$ r_{ys} = \frac{183.7}{\sqrt{(2.469)(120{,}870)}} = 0.3363 $$

$$ r_{xs} = \frac{413.4}{\sqrt{(4.238)(120{,}870)}} = 0.5776 $$

$$ r_{yx|s} = \frac{0.7806 - (0.3363)(0.5776)}{\sqrt{1 - (0.3363)^2}\,\sqrt{1 - (0.5776)^2}} = 0.7627 $$

$$ r_{ys|x} = \frac{0.3363 - (0.7806)(0.5776)}{\sqrt{1 - (0.7806)^2}\,\sqrt{1 - (0.5776)^2}} = -0.2246 $$

*This table is shown as Table 1-17 on page 276 in Reference 13. (See Section
6 for a complete list of references.)
$x_i - \bar{x}$, and $s_i - \bar{s}$. Values for insertion into the Q-matrix
were computed from Table 3-2 and are listed in Table 3-3. Thus,
$q_{yy} = (y - \bar{y})(y - \bar{y})$, similarly for $q_{xx}$ and $q_{ss}$,
and $q_{yx} = (y - \bar{y})(x - \bar{x})$. From Table 3-3, numerical values
for insertion into the Q-matrix were derived. The Q-matrix is defined as
follows:


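$$
Q = \begin{array}{c} y \\ x \\ s \end{array}
\begin{pmatrix}
q_{11} & q_{12} & q_{13} \\
q_{21} & q_{22} & q_{23} \\
q_{31} & q_{32} & q_{33}
\end{pmatrix}
$$

where

$$
\begin{aligned}
q_{11} &= \sum_{i=1}^{17} (q_{yy})_i, &
q_{12} &= \sum_{i=1}^{17} (q_{yx})_i, &
q_{13} &= \sum_{i=1}^{17} (q_{ys})_i, \\
q_{21} &= q_{12}, &
q_{22} &= \sum_{i=1}^{17} (q_{xx})_i, &
q_{23} &= \sum_{i=1}^{17} (q_{xs})_i, \\
q_{31} &= q_{13}, &
q_{32} &= q_{23}, &
q_{33} &= \sum_{i=1}^{17} (q_{ss})_i.
\end{aligned}
\tag{3.1}
$$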


From  the  Q-matrix,  correlation  coefficients  are  easy  to  compute.  Thus, 


$$ r_{yx} = \frac{q_{12}}{\sqrt{q_{11}\, q_{22}}} = \frac{2.525}{\sqrt{(2.469)(4.238)}} = 0.7806 \tag{3.2} $$


The  other  linear  correlation  coefficients  are  computed  similarly. 
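
As a minimal sketch of equation (3.2), the three linear correlation
coefficients can be reproduced from the Q-matrix entries quoted in Table 3-4.
The dictionary layout and the helper name `linear_r` are assumptions made
here for illustration; the numerical entries are those of Table 3-4.

```python
import math

# Q-matrix entries as quoted in Table 3-4 (variables y, x, s).
q = {
    ("y", "y"): 2.469, ("x", "x"): 4.238, ("s", "s"): 120870.0,
    ("y", "x"): 2.525, ("y", "s"): 183.7, ("x", "s"): 413.4,
}

def linear_r(a, b):
    """Linear correlation coefficient r_ab = q_ab / sqrt(q_aa * q_bb)."""
    return q[(a, b)] / math.sqrt(q[(a, a)] * q[(b, b)])

r_yx = linear_r("y", "x")   # neck injury vs. seat designation
r_ys = linear_r("y", "s")   # neck injury vs. aircraft speed
r_xs = linear_r("x", "s")   # seat designation vs. aircraft speed

print(round(r_yx, 4), round(r_ys, 4), round(r_xs, 4))
```

The printed values agree with the 0.7806, 0.3363 and 0.5776 listed in
Table 3-4 to the precision shown there.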
Partial correlation coefficients are computed as follows [4, p. 496]:

$$ r_{yx|s} = \frac{r_{yx} - (r_{ys})(r_{xs})}{\sqrt{(1 - r_{ys}^2)(1 - r_{xs}^2)}} \tag{3.3} $$

Insert numbers already developed to get:

$$ r_{yx|s} = \frac{0.7806 - (0.3363)(0.5776)}{\sqrt{\left(1 - (0.3363)^2\right)\left(1 - (0.5776)^2\right)}} = 0.7627 $$


This  last  equation  is  interpreted  from  correlation  theory  as  follows: 
The  partial  correlation  coefficient  between  neck  injuries  and  seat  config- 
uration knowing,  or  given,  speed  at  ejection  is  0.7627.  No  explanation  is 
given  for  the  necessity  or  desirability  of  computing  critical  correlation 
coefficients,  other  than  as  a threshold  value.  No  references  were  detected 
giving  the  statistical  theory  underlying  these  coefficients.  No  method  for 
computing  these  coefficients  was  given.  Further,  the  application  of  corre- 
lation analysis  to  a dichotomous  situation  also  has  been  questioned. 
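
The partial correlation computation of equation (3.3) can be sketched as
follows. The function name `partial_r` is an assumption introduced here; the
inputs are the linear correlation coefficients quoted in Table 3-4.

```python
import math

def partial_r(r_ab, r_ac, r_bc):
    """First-order partial correlation of a and b with c held fixed (eq. 3.3)."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac**2) * (1 - r_bc**2))

# Linear correlation coefficients quoted in Table 3-4.
r_yx, r_ys, r_xs = 0.7806, 0.3363, 0.5776

r_yx_given_s = partial_r(r_yx, r_ys, r_xs)  # injury vs. seat, given speed
r_ys_given_x = partial_r(r_ys, r_yx, r_xs)  # injury vs. speed, given seat

print(round(r_yx_given_s, 4), round(r_ys_given_x, 4))
```

Both results match the partial correlation coefficients of Table 3-4
(0.7627 and -0.2246) to within rounding of the inputs.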


3.2  Contingency  Table  Analysis 

Contingency  Table  Analysis  is  another  statistical  technique  for 
analyzing  ejection  seat  data.  This  technique  was  employed  on  a sample  of 
ejection  data  from  RA-5C  aircraft  which  were  equipped  with  the  HS-1A  (updated) 
ejection  seat.  The  data  are  summarized  in  Table  3-5.  The  data  sample  dis- 
cussed below  is  believed  not  to  be  the  same  as  that  discussed  previously. 

A simple, but general, 2x2 contingency table is shown in Figure 3-1 below.

                  I         II       TOTALS
   A             a_1        a_2       N_A
   B             b_1        b_2       N_B
   TOTALS        N_1        N_2       N

FIGURE 3-1. A GENERAL 2x2 CONTINGENCY TABLE
TABLE 3-5*

CONTINGENCY TABLE DATA ANALYSIS FOR THE RA-5C AIRCRAFT
(Neck Injury/No Neck Injury Versus Updated Seat/No Updated Seat)

*These data are found on page 278 in Reference 13. (See Section 6 for a
complete list of references.)


Figure 3-1 is read as follows: An event with attribute A occurs $a_1$ times
under condition I, and $a_2$ times under condition II. A mutually exclusive
event with attribute B occurs $b_1$ times under condition I, and $b_2$ times
under condition II. Clearly, $N_A = a_1 + a_2$, $N_B = b_1 + b_2$,
$N_1 = a_1 + b_1$, $N_2 = a_2 + b_2$, and $N = N_A + N_B = N_1 + N_2$.

To  test  a hypothesis  developed  in  conjunction  with  a contingency 
table,  use  is  made  of  the  Chi-square  statistical  test.  This  statistical  test 
is  defined  as  follows  for  a k x m contingency  table. 


$$ \chi^2 = \sum_{i=1}^{k} \sum_{j=1}^{m} \frac{(o_{ij} - e_{ij})^2}{e_{ij}}, \qquad i = 1, 2, \ldots, k; \quad j = 1, 2, \ldots, m \tag{3.4} $$

in which $o_{ij}$ = the observed frequency in the (i,j)th cell in the
contingency table, and $e_{ij}$ = the corresponding expected or theoretical
frequency in the same cell. Equation (3.4) may be simplified somewhat as
follows:


$$ \chi^2 = \sum_i \sum_j \frac{o_{ij}^2}{e_{ij}} - 2 \sum_i \sum_j o_{ij} + \sum_i \sum_j e_{ij} = \sum_i \sum_j \frac{o_{ij}^2}{e_{ij}} - N, \tag{3.5} $$

since

$$ \sum_i \sum_j o_{ij} = \sum_i \sum_j e_{ij} = N. $$


Relating this to Figure 3-1, it is seen, for example, that

$$ e_{12} = \frac{N_A N_2}{N} \tag{3.6} $$

Likewise, the entire expression for the Chi-square statistic is, from
Figure 3-1:

$$ \chi^2 = N \left[ \frac{1}{N_A} \left( \frac{a_1^2}{N_1} + \frac{a_2^2}{N_2} \right) + \frac{1}{N_B} \left( \frac{b_1^2}{N_1} + \frac{b_2^2}{N_2} \right) - 1 \right] \tag{3.7} $$



To  translate  the  above  into  concrete  AAES  terms,  consider  the  sam- 
ple of  RA-5C  ejection  data  shown  above.  The  following  null  hypothesis  is  to 
be  tested  at  the  0.05  significance  level: 

$H_0$: Introduction of the HS-1A ejection seat has not resulted in an
increase in the number of neck injuries incurred upon ejection.

This null hypothesis can be stated another way. Thus,

$H_0$: Neck injuries experienced upon ejection from the RA-5C aircraft are
independent of ejection seat type (HS-1/HS-1A) used.


To test this hypothesis, a 2 x 2 contingency table was analyzed as shown
below. A sample of 50 ejections was considered. The analysis,
self-explanatory, follows from Table 3-5.

$$ \chi^2_{c/\mathrm{d.f.}=1} = \sum_{i=1}^{2} \sum_{j=1}^{2} \frac{(f_{ij} - e_{ij})^2}{e_{ij}} \tag{3.8} $$

$$ \chi^2_{c/\mathrm{d.f.}=1} = \frac{(5 - 1.4)^2}{1.4} + \frac{(5 - 8.6)^2}{8.6} + \frac{(2 - 5.6)^2}{5.6} + \frac{(38 - 34.4)^2}{34.4} = 9.26 + 1.51 + 2.31 + 0.38 = 13.46 $$

The theoretical value of the Chi-square statistic, denoted here by
$\chi^2_T$, has the numerical value

$$ \chi^2_{T/0.05;\,\mathrm{d.f.}=1} = 3.841. $$
Since $\chi^2_c > \chi^2_{T/0.05}$, updating the seat has resulted in a
statistically significant shift in neck injuries. Therefore, the null
hypothesis is rejected at the 0.05 level of significance, and we conclude
that, for the sample under investigation, updating the HS-1 to the HS-1A seat
is highly correlated with an increase in neck injuries. At this point, it is
interesting to observe that all neck injuries in the sample discussed here
occurred upon ejection from RA-5C aircraft using an HS-1A (updated) seat
escape system. The Yates' correction factor was not applied to this
Chi-square test. Application of Yates' correction factor yields
$\chi^2_c = 9.9772$, and the null hypothesis is still rejected.
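
The Chi-square computation above, with and without Yates' correction, can be
sketched in a few lines of Python. This is an illustration under stated
assumptions: the observed cell frequencies are those appearing in the worked
computation (5, 5, 2, 38), the expected frequencies are derived from the
marginal totals as in equation (3.6), and the names `chi_square` and
`CRITICAL_05_DF1` are introduced here.

```python
# Observed cell frequencies from the worked computation (N = 50).
observed = [[5, 5],
            [2, 38]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n_total = sum(row_totals)

# Expected frequency under independence: e_ij = (row total)(column total) / N.
expected = [[r * c / n_total for c in col_totals] for r in row_totals]

def chi_square(obs, exp, yates=False):
    """Pearson Chi-square; optionally with Yates' continuity correction."""
    total = 0.0
    for obs_row, exp_row in zip(obs, exp):
        for o, e in zip(obs_row, exp_row):
            diff = abs(o - e) - (0.5 if yates else 0.0)
            total += diff * diff / e
    return total

CRITICAL_05_DF1 = 3.841  # tabulated value quoted in the text

chi2 = chi_square(observed, expected)
chi2_yates = chi_square(observed, expected, yates=True)
reject_null = chi2 > CRITICAL_05_DF1

print(round(chi2, 3), round(chi2_yates, 4), reject_null)
```

Carried in unrounded arithmetic the statistic is about 13.455; the 13.46 in
the text is the sum of the four individually rounded terms. The null
hypothesis is rejected in either case.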


A coefficient of contingency, C, can be computed for this contingency test.

$$ C = \sqrt{\frac{\chi^2}{\chi^2 + N}} \tag{3.9} $$

For the RA-5C sample studied,

$$ C = \sqrt{\frac{13.46}{13.46 + 50}} = 0.4605 $$


The  coefficient  of  contingency  is  analogous  to  a correlation  coef- 
ficient in  that  it  measures  the  strength  of  a relationship  between  the  two 
variables  analyzed  in  a contingency  table. 
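
Equation (3.9) is a one-line computation. The sketch below uses a
hypothetical helper name, `contingency_coefficient`, with the Chi-square
value and sample size quoted in the text.

```python
import math

def contingency_coefficient(chi2, n):
    """Coefficient of contingency C = sqrt(chi2 / (chi2 + N)), eq. (3.9)."""
    return math.sqrt(chi2 / (chi2 + n))

c = contingency_coefficient(13.46, 50)
print(round(c, 4))
```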

A contingency  table  that  would  be  of  considerable  interest  is  that 
of fatality/no fatality versus updated seat/no updated seat. This kind of
statistical  analysis  is  a relatively  simple  yet  powerful  tool  in  the  general 
context  of  statistical  analyses  of  the  AAES  ejection  injury  data. 
Section  4.0

PROPOSED  STATISTICAL  ANALYSIS  METHODOLOGY 


4.0  Proposed  Statistical  Analysis  Methodology 


There is a wide variety of statistical analysis techniques that can
be applied to the Aircrew Automated Escape Systems. These include the
following:

• Data  Analyses. 

• Use  of  discrete  parent  probability  density  functions  such  as  bino- 
mial, Poisson,  and  multinomial. 

• Higher  order  contingency  table  analysis. 

• Application  of  Bayesian  statistical  theory  to  refine  a priori  prob- 
ability estimates. 

• Application  of  non-parametric  (distribution  free)  statistical  tech- 
niques such  as  the  run  and  trend  tests. 

Each  of  the  above  will  be  illustrated  in  the  material  that  follows. 


4.1  Data  Analysis

An ejection related injury data classification scheme that would be
definitive, amenable to statistical analysis, and in consonance with MORs
classification is shown below. Here it is noted that numerical values are
attached to the MORs classification. Advantages of employing such a scheme
are the following: (1) ease of applying statistical/numerical techniques,
(2) ease of expanding the classification to obtain better subjective focus
on a particular injury pattern, and (3) conformity and uniformity of injury
classification. Illustrations of each of these are given below.


   Injury                                  Injury            Expanded Injury
   Classification    Injury                Classification    Classification
   (Alpha)           Category              (Numerical)       (Numerical)
   --------------    ------------------    --------------    ---------------
   G                 No/Minimal                  1               10-19
   F                 Minor                       2               20-29
   B                 Major                       3               30-39
   A                 Fatal                       4               40-49
   L & U             Lost and Unknown            5               50-59

An expanded injury classification scheme also could be used to order
injuries in increasing order of severity. For example,

   Expanded Injury
   Classification
   (Numerical)        Range of Injury Severity
   ---------------    --------------------------------------------------
   10-19              No/minimal injury to minor injury
   20-29              Minor to major (non-fatal) injury
   30-39              Major (non-fatal) to fatal injury
   40-49              Fatality (classify according to kind of fatality)
   50-59              Lost and Unknown (classify according to probable
                      cause)


The  above  numerical  classification  scheme  would  assist  greatly  in 
the  major  injury  area,  where  injuries  could  vary  from,  say,  a sprained  arm 
to  very  severe  contusions  and  lacerations.  In  addition,  such  a scheme  would 
assist  the  statistical/numerical  analyst  in  his  application  of  various  tech- 
niques to  the  injury  data. 
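
A minimal sketch of how such a scheme might be encoded for numerical work
follows. The mapping reproduces the classification table above; the function
name `expanded_code`, the `sub_level` digit used to rank severity within a
category, and the treatment of "L" and "U" as separate keys for class 5 are
all hypothetical choices made here for illustration.

```python
# Alpha-to-numerical mapping taken from the classification table above.
# "L" and "U" are keyed separately here as an assumption; both map to class 5.
ALPHA_TO_NUMERIC = {"G": 1, "F": 2, "B": 3, "A": 4, "L": 5, "U": 5}

CATEGORY = {
    1: "No/Minimal",
    2: "Minor",
    3: "Major",
    4: "Fatal",
    5: "Lost and Unknown",
}

def expanded_code(alpha, sub_level=0):
    """Return an expanded code such as 30-39 for a major injury.

    sub_level (0-9) is a hypothetical within-category severity ranking.
    """
    if not 0 <= sub_level <= 9:
        raise ValueError("sub_level must be between 0 and 9")
    return 10 * ALPHA_TO_NUMERIC[alpha] + sub_level

# Example: a major injury ranked 7 within its category.
code = expanded_code("B", 7)
print(code, CATEGORY[code // 10])
```

Integer division by 10 recovers the basic numerical class from any expanded
code, which keeps the two classification columns consistent.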


An example of injury classification by aircraft type for a set of
actual ejections under combat conditions is shown in Table 23 in [1, p. 28].
Detailed information about the various injuries is displayed in Table 24 in
[1, p. 28]. Injury classification was restricted to major, minor and no
injury. These tables are reproduced here as Tables 4-1 and 4-2, respectively.


4.2  Discrete  Probability  Density  Functions 

Use of probability density functions to include the binomial parent
density function now will be discussed in the context of AAES ejection related
injury data. At the outset, three notes are made: (1) the data selected
for analysis here are actual observations of A-6 ejection related operational
injury data, (2) the data are classified in keeping with the scheme outlined
in Section 4.1, above, and (3) the following percentages, based on actual MORs
data, are assigned to the various injury categories:

1 - No/Minimal injury, 18.69%

2 - Minor injury, 35.52%

3 - Non-Fatal Major injury, 25.23%

4 - Fatality, 14.02%, and

5 - Lost and Unknown, 6.54%

The  above  information  is  summarized  in  Table  4-3. 


TABLE 4-3

INJURY CLASSIFICATION WITH PERCENTAGES
(A-6 Operational Ejection Related Injuries During 1969-1975)

   Injury Classification             Injuries    Percentage
   ------------------------------    --------    ----------
   1 - No injury/Minimal                20          18.69
   2 - Minor injury                     38          35.52
   3 - Non-fatal major injury           27          25.23
   4 - Fatality                         15          14.02
   5 - Other (Lost and Unknown)          7           6.54


4.2.1  The  Binomial  Probability  Density  Function 

To consider an example of use of the binomial probability density function,
assume that the probability of the occurrence of a known fatality, on an
ejection-to-ejection basis, is 0.10. This assumption is not unreasonable for
the following reasons: (1) over the time interval 1 January 1969 through
31 December 1975, the MORs catalogued 1,069 ejections from all aircraft, and
(2) over the same time period, there were 117 known fatalities. Thus, the
relative frequency of occurrence of a fatality is equal to 0.109.

Now make the following definitions:

$p_f$ = 0.1 = probability of a fatality on each ejection.

$1 - p_f$ = 0.9 = probability of other than a fatality on each ejection.

Suppose now a sample consisting of 30 ejections is available
for analysis. What is the probability that exactly 3 of the ejections will
result in fatalities? Using equation (4.3) and the above a priori
probabilities from ejection-to-ejection, it is seen that:

$$ f(3) = \binom{30}{3} (0.1)^3 (0.9)^{30-3} = 0.236087 $$
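
The binomial probability just quoted can be verified with the Python standard
library. The function name `binomial_pmf` is an assumption made here; the
inputs (n = 30, x = 3, p = 0.1) and the ejection and fatality counts are
those given in the text.

```python
import math

def binomial_pmf(x, n, p):
    """Binomial probability of exactly x occurrences in n trials."""
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

# Relative frequency of a fatality from the MORs counts quoted above.
relative_frequency = 117 / 1069

# Probability of exactly 3 fatalities in 30 ejections with p = 0.1.
p_three_fatalities = binomial_pmf(3, 30, 0.1)

print(round(relative_frequency, 3), round(p_three_fatalities, 6))
```

The result agrees with the 0.236087 above to about six decimal places.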


Using this information, together with the additional assumption that future
ejections occur under conditions similar to those which have occurred in the
past, how many fatalities can be expected to occur during the course of the
next 50 ejections? If the assumption is made that f(x) for this problem is
the same as f(3) for the preceding problem, then

$$ f(x) = 0.236087 \tag{4.1} $$

Equation (4.1) can be written as follows:

$$ A(x) = 45.8086 - \frac{50!}{x!\,(50 - x)!} \left(\frac{1}{9}\right)^{x} \tag{4.2} $$

where it is desired to find an x such that A(x) is minimized. From equation
(4.2), Table 4-4 is derived:

TABLE 4-4

VALUES OF A(x) VERSUS x

     x       A(x)
   -----    ------
     1      40.253
     2      25.129
     3      18.922
     4      10.707
     5       9.927
     6      15.907
     7      24.925


Not surprisingly, we should expect 5 fatalities during the next 50 ejections,
where the ejections are conducted under conditions comparable to those under
which ejections have been taking place during the recent past. Note that
x = 5 is the value of x which minimizes A(x). A graph of A(x) versus x is
shown in Figure 4-1.

FIGURE 4-1. GRAPH OF A(x) VERSUS x

Analyses  similar  to  the  above  could  be  performed  on  the 
injury  categories  entitled  1 - no  injury,  2 - minor  injury,  and  3 - non-fatal 
major  injury. 
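
The search for the x that minimizes A(x) in equation (4.2) can be sketched as
follows. The constant 45.8086 is the value quoted in equation (4.2), which
corresponds to 0.236087 divided by 0.9 raised to the 50th power; the names
`TARGET` and `a_of_x` are assumptions introduced for this illustration.

```python
import math

TARGET = 45.8086  # quoted in equation (4.2); approximately 0.236087 / 0.9**50

def a_of_x(x, n=50):
    """A(x) = 45.8086 - C(n, x) * (1/9)**x, following equation (4.2)."""
    return TARGET - math.comb(n, x) / 9**x

# Evaluate A(x) over every possible number of fatalities in 50 ejections.
values = {x: a_of_x(x) for x in range(0, 51)}
best_x = min(values, key=values.get)

# For comparison: the binomial mean n * p for n = 50, p = 0.1.
binomial_mean = 50 * 0.1

print(best_x, round(values[best_x], 3), binomial_mean)
```

The minimizing value, x = 5, coincides with the binomial mean np = 5, which
offers an independent check on the figure stated in the text.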

As  another  example  of  application  of  the  binomial  prob- 
ability density  function,  consider  the  dichotomous  injury  data  presented  ear- 
lier, namely  1 = neck  injury  and  0 = no  neck  injury.  Underlying  this  binary 
injury  designation  is  the  idea  of  a probability  density  function.  Thus,  there 
is  a probability  measure  attached  to  each  designation.  For  example,  it  seems 
clear  to  say  that  if  an  ejectee  will  have  a neck  injury,  with  a given  prob- 
ability, then  he  will  have  no  neck  injury  with  a probability  equal  to  one 
minus  the  probability  that  he  will  have  a neck  injury.  In  symbols,  let 

p = probability  of  neck  injury,  then 
q = (1  - p)  = probability  of  no  neck  injury. 

Immediately  these  relate  to  the  binomial  probability  density  function  defined 
as  follows: 
$$ f(x) = \binom{n}{x} p^{x} q^{n-x} $$

where f(x) = binomial probability density function

$\binom{n}{x}$ = binomial coefficient = $\dfrac{n!}{x!\,(n - x)!}$

p = probability of occurrence of an event with a given attribute, A,

q = (1 - p) = probability of non-occurrence of the event with a given
attribute, A,

x = number of occurrences observed

n = number of experiments performed during the course of which x
occurrences were observed


To  relate  this  binomial  probability  density  function  to 
the  example  discussed  in  Section  3.1,  let  the  a priori  probability  of  occur- 
rence of  a neck  injury  be  defined  as  the  relative  frequency  of  occurrence 
of  a neck  injury.  Then, 


$$ f(x) = \binom{n}{x} \left(\frac{3}{17}\right)^{x} \left(\frac{14}{17}\right)^{n-x} $$


Again,  it  should  be  stated  that  these  numbers  were  generated  using  the  relative 
frequency  as  the  a priori  probability  p = 3/17  of  a neck  injury. 

These  a priori  probability  measures  can  doubtless  be  refined 
by  Bayesian  techniques,  if  sufficient  injury/no  injury  data  are  available. 

This  possibility  will  be  discussed  in  more  depth  presently. 


To  summarize,  it  is  believed  imperative  that  some  parent 
probability  density  function  be  associated  with  a binary  injury/no  injury 
data  analysis  procedure. 

4.2.2  The  Poisson  Probability  Density  Function 

As  mentioned  previously,  the  binomial  probability  density 
function  is  defined  by  the  equation 


$$f_b(n; x, p) = \binom{n}{x} p^x (1 - p)^{n-x} \tag{4.3}$$


in  which 


n = number of trials in a given experiment
x = number of successes in a given experiment

p = probability of success on a given trial within the experiment.


When  n is  large,  direct  calculation  of  probabilities  using  equation  (4.3) 
involves  an  enormous  amount  of  calculation.  As  an  example,  suppose  it  is 
known  a priori  that  the  probability  of  an  individual  getting  frostbitten  on 
a cold  winter  day  while  attending  a sporting  event  is  0.001.  What  is  the 
probability that 27 persons in a crowd of 30,000 will get frostbitten? Direct
substitution into equation (4.3) yields:

$$f_b(30{,}000; 27, 0.001) = \binom{30{,}000}{27} (0.001)^{27} (1 - 0.001)^{29{,}973} \tag{4.4}$$

Considerable effort would be required to evaluate equation (4.4) by hand.


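If the observation is made that n is large and p is small such that np =
constant as $n \to \infty$ and $p \to 0$, then the Poisson probability function
can be useful. Let np = $\lambda$, a constant, as $n \to \infty$ and $p \to 0$.
Then write equation (4.3) as follows:

$$f_b(n; x, p) = \frac{n (n - 1)(n - 2) \cdots (n - x + 1)}{x!} \left(\frac{\lambda}{n}\right)^x \left(1 - \frac{\lambda}{n}\right)^{n-x} \tag{4.5}$$

Write

$$\left(1 - \frac{\lambda}{n}\right)^{n-x} = \left(1 - \frac{\lambda}{n}\right)^{n} \left(1 - \frac{\lambda}{n}\right)^{-x}$$

then substitute into equation (4.5) to get:

$$f_b(n; x, p) = \frac{\lambda^x}{x!} \left[\frac{n (n - 1) \cdots (n - x + 1)}{n^x}\right] \left(1 - \frac{\lambda}{n}\right)^{n} \left(1 - \frac{\lambda}{n}\right)^{-x}$$

Take the limit as $n \to \infty$ with $\lambda$ = constant to get:

$$f_p(x; \lambda) = \frac{\lambda^x e^{-\lambda}}{x!} \tag{4.6}$$

which is the Poisson probability density function.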


The preceding problem can now be worked with considerable ease. Thus,

np = (30,000)(0.001) = 30
x = 27.

Substitute into equation (4.6) to get:

$$f_p(27; 30) = \frac{30^{27} e^{-30}}{27!}$$

$$f_p(27; 30) = 0.06553$$
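The frostbite example can be checked numerically. The sketch below, which uses only the Python standard library, evaluates both the exact binomial expression of equation (4.4) and the Poisson approximation of equation (4.6). The variable names are illustrative and are not part of the original analysis.

```python
from math import comb, exp, factorial

n, x, p = 30_000, 27, 0.001
lam = n * p  # lambda = np = 30

# Exact binomial probability, equation (4.4). Python integers are exact,
# so the very large binomial coefficient is evaluated without overflow.
binomial_exact = comb(n, x) * p**x * (1 - p) ** (n - x)

# Poisson approximation, equation (4.6).
poisson_approx = lam**x * exp(-lam) / factorial(x)

print(f"binomial: {binomial_exact:.5f}")
print(f"Poisson : {poisson_approx:.5f}")
```

The two values agree closely, which is the practical justification for replacing the binomial by the Poisson density when n is large and p is small.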


Events to which the Poisson probability density function applies are sometimes
called rare events, and they occur in widely different fields. Examples include
the number of individuals born blind per year in a large city; the number of
organisms of a given size S on a glass slide that escape death by X-rays after
having been exposed for t seconds; the number of times in a year that the
volume of shares traded on the New York Exchange will exceed M million; and the
number of peak loads in telephone traffic during a given period of time.

To apply the Poisson probability density function to the Aircrew Automated
Escape System, suppose it is known that 5000 propulsion actuated escape devices
have been manufactured. What is the probability of there being exactly 2
devices among the 5000 that will fail to fire, given the probability of an
individual device failing to fire as 0.005? Here, n = 5,000, p = 0.005,
$\lambda$ = np = 25, x = 2. Substitute into equation (4.6) to get:

$$f_p(2; 25) = \frac{25^2 e^{-25}}{2!} = 0.0000000043.$$

There is thus a very small chance that exactly two devices out of the 5,000
will fail to fire.


A question sometimes asked is the following: What is the probability of there
being more than two devices that will fail to fire? To answer this question,
examine the cumulative Poisson distribution function. Thus,

$$\text{Prob}(x = 0 \text{ or } 1 \text{ or } 2) = \sum_{x=0}^{2} f_p(x; \lambda)$$

From equation (4.6),

$$\text{Prob}(x = 0 \text{ or } 1 \text{ or } 2) = e^{-25} \left(1 + 25 + \frac{25^2}{2!}\right) = 0.0000000047$$

Hence,

$$P(x > 2) = 1 - \text{Prob}(x = 0 \text{ or } 1 \text{ or } 2),$$

$$P(x > 2) = 0.9999999953.$$


Thus,  it  is  seen  that,  under  the  conditions  assumed,  there  is  a very  high 
probability  that  more  than  two  devices  will  fail  to  fire.  If  p = 0.0001  above, 
then  P (x  > 2)  = 0.01439.  This  points  out  the  necessity  for  high  reliability 
in  each  hardware  component. 
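The device-failure figures above can be reproduced with a short sketch of the cumulative Poisson calculation. The helper names `poisson_pmf` and `prob_more_than` are hypothetical; the inputs (5,000 devices, p = 0.005 and p = 0.0001) are those quoted in the text.

```python
from math import exp, factorial

def poisson_pmf(x: int, lam: float) -> float:
    """Equation (4.6): lam**x * exp(-lam) / x!"""
    return lam**x * exp(-lam) / factorial(x)

def prob_more_than(k: int, lam: float) -> float:
    """P(x > k) = 1 - sum of the Poisson density over x = 0, ..., k."""
    return 1.0 - sum(poisson_pmf(x, lam) for x in range(k + 1))

n_devices = 5_000
exactly_two = poisson_pmf(2, n_devices * 0.005)          # lambda = 25
tail_p_0005 = prob_more_than(2, n_devices * 0.005)       # p = 0.005
tail_p_00001 = prob_more_than(2, n_devices * 0.0001)     # p = 0.0001, lambda = 0.5

print(f"P(x = 2), p = 0.005  : {exactly_two:.10f}")
print(f"P(x > 2), p = 0.005  : {tail_p_0005:.10f}")
print(f"P(x > 2), p = 0.0001 : {tail_p_00001:.5f}")
```

Lowering the per-device failure probability by a factor of fifty moves the chance of more than two failures from near certainty to about 1.4 percent, which is the reliability point the text makes.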


4.2.3  The Multinomial Probability Density Function


The binomial experiment becomes a multinomial experiment if each trial can have
more than two possible outcomes. In a general case, if a given trial can result
in any one of k possible outcomes $E_1, E_2, \ldots, E_k$, with probabilities
$p_1, p_2, \ldots, p_k$, then the probability density function of the random
variables $x_1, x_2, \ldots, x_k$, representing the number of occurrences for
the events $E_1, E_2, \ldots, E_k$ in n independent trials, is defined by the
equation:

$$f_m(\vec{x}; \vec{p}; n) = \binom{n}{x_1, x_2, \ldots, x_k} p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k} \tag{4.7}$$

Equation (4.7) may be written compactly as follows:

$$f_m(\vec{x}; \vec{p}; n) = n! \prod_{i=1}^{k} \frac{p_i^{x_i}}{(x_i)!} \tag{4.8}$$

where

$$\vec{x} = (x_1, x_2, \ldots, x_k); \quad \sum_{i=1}^{k} x_i = n,$$

$$\vec{p} = (p_1, p_2, \ldots, p_k); \quad \sum_{i=1}^{k} p_i = 1,$$

$$\binom{n}{x_1, x_2, \ldots, x_k} = \frac{n!}{x_1! \, x_2! \cdots x_k!} \tag{4.9}$$


A partial  graph  of  a trinomial  probability  density  function  is  shown  in 
Figure  4-2. 


To  relate  the  multinomial  probability  density  function 
to  the  AAES  ejection  related  injury  problem,  refer  to  data  in  Table  4-3  which 
contains  the  injury  history  of  ejectees  from  the  A-6  aircraft  over  the  time 
span  1 January  1969  through  31  December  1975.  Using  the  relative  frequency 
of  occurrence  of  an  injury  as  its  individual  probability  of  occurrence,  what 
is  the  probability  of  obtaining  exactly  the  injury  pattern  shown  in  Table  4-3? 
Under the assumptions made, $p_1$ = 0.1869, $p_2$ = 0.3552, $p_3$ = 0.2523,
$p_4$ = 0.1402, and $p_5$ = 0.0654. Also, $x_1$ = 20, $x_2$ = 38, $x_3$ = 27,
$x_4$ = 15 and $x_5$ = 7.

Substitute these numbers into equation (4.8) to get:

$$f_m(\vec{x}; \vec{p}; n) = \frac{107!}{20! \, 38! \, 27! \, 15! \, 7!} \left[(0.1869)^{20} (0.3552)^{38} (0.2523)^{27} (0.1402)^{15} (0.0654)^{7}\right]$$

$$f_m(\vec{x}; \vec{p}; n) = 0.0001738; \quad n = 107$$

This low probability is not surprising since a prescribed injury pattern was
given.
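The multinomial evaluation above is tedious by hand but short in code. The following sketch applies equation (4.8) to the counts and relative frequencies quoted in the text; the function name `multinomial_pmf` is illustrative, and Table 4-3 itself is not reproduced here.

```python
from math import factorial, prod

def multinomial_pmf(counts, probs):
    """Equation (4.8): n! * product over i of p_i**x_i / x_i!"""
    n = sum(counts)
    coefficient = factorial(n)
    for x in counts:
        coefficient //= factorial(x)  # stays an exact integer at every step
    return coefficient * prod(p**x for p, x in zip(probs, counts))

counts = [20, 38, 27, 15, 7]                       # x_1 ... x_5 quoted in the text
probs = [0.1869, 0.3552, 0.2523, 0.1402, 0.0654]   # p_1 ... p_5 quoted in the text

probability = multinomial_pmf(counts, probs)
print(f"n = {sum(counts)}, f_m = {probability:.7f}")
```

Because the probabilities are rounded to four places, the result agrees with the quoted value only to about three significant figures.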

4.3  Bayesian  Statistical  Theory  and  Applications 


Bayesian statistical theory can be used to refine prior probability estimates
if sufficient data, past and current, exist. If such data are subjected to the
Bayesian algorithm, posterior probability estimates are calculated which are
more representative of reality than are the prior probability estimates.
Bayesian techniques will be applied to appropriate AAES data sets so that
refined probability estimates can be inferred.

The  statement  of  Bayes'  Theorem  and  proof  are  very  short.  Both 
are  given  below.  [2,  pg.  598] 

Theorem:

Let $B_1, B_2, \ldots, B_n$ be a mutually exclusive and exhaustive set of
events. Let E be another event that occurs if and only if one of the events
$B_i$ has occurred. Let $B_i$ be that event. Then,

$$P(B_i \mid E) = \frac{P(B_i) \, P(E \mid B_i)}{\sum_{j=1}^{n} P(B_j) \, P(E \mid B_j)} \tag{4.10}$$


Proof:

Use the general multiplication rule in both its forms, thus:

$$P(B_i \cap E) = P(B_i) \, P(E \mid B_i) \tag{4.11}$$

$$P(B_i \cap E) = P(E) \, P(B_i \mid E) \tag{4.12}$$

Due to the fact that the events $B_j$ are mutually exclusive and exhaustive,

$$P(E) = \sum_{j=1}^{n} P(B_j) \, P(E \mid B_j) \tag{4.13}$$

Write equation (4.12) thus

$$P(B_i \mid E) = \frac{P(B_i \cap E)}{P(E)} \tag{4.14}$$

Now substitute equations (4.11) and (4.13) into equation (4.14),

$$P(B_i \mid E) = \frac{P(B_i) \, P(E \mid B_i)}{\sum_{j=1}^{n} P(B_j) \, P(E \mid B_j)}$$

which is equation (4.10) and proves the theorem.


To use Bayesian Statistical Theory in an analysis of the AAES ejection/fatality
problem consider the following situation: A study was performed on the A-6
ejection/fatality problem. A sample of operational ejections from the A-6
aircraft was studied each year for the past six years. The fraction of
fatalities occurring during each year, here defined as the ratio of fatalities
to ejections, or fatality rate, is shown as column 4 in Table 4-5. The actual
number of fatalities year-by-year is listed in column 3. The relative frequency
of occurrence of fatalities, here designated as the a priori probability of
occurrence of fatalities, is displayed in column 5. Three fatalities have
occurred during the course of 16 ejections in the current year. Use this
information, together with Bayesian Statistical Theory, to develop improved
numbers for the a priori probability of occurrence of fatalities.


TABLE 4-5

ANALYSIS OF A-6 EJECTION RELATED FATALITIES

                                     FRACTION OF          RELATIVE FREQUENCY
                                     FATALITIES           (PRIOR PROBABILITY)
YEAR     EJECTIONS     FATALITIES    (Col. 3 ÷ Col. 2)    (Col. 3 ÷ Σ Col. 3)
(1)      (2)           (3)           (4)                  (5)

1969     14            5             0.3571               0.2778
1970     17            3             0.1765               0.1667
1971     22            0             0.0000               0.0000
1972     18            5             0.2778               0.2777
1973     10            2             0.2000               0.1111
1974     8             3             0.3750               0.1667

TOTAL:   89            18            0.2022               1.0000

Now,  make  use  of  the  numbers  in  Table  4-5,  together  with  Bayesian  Statistical 
Theory  to  construct  Table  4-6.  The  purpose  of  Bayesian  statistical  analysis 
is  to  use  current  information  to  refine  previously  derived  results. 
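One way the update might be carried out is sketched below. This is an illustration under stated assumptions, not a reproduction of Table 4-6: each year's fatality rate (column 4) is treated as a candidate hypothesis $B_i$ with prior probability taken from column 5, and the likelihood of the current-year evidence E (3 fatalities in 16 ejections) is assumed to be binomial. The binomial likelihood and all variable names are assumptions made for this sketch.

```python
from math import comb

ejections = [14, 17, 22, 18, 10, 8]   # Table 4-5, column 2 (1969-1974)
fatalities = [5, 3, 0, 5, 2, 3]       # Table 4-5, column 3

rates = [f / e for f, e in zip(fatalities, ejections)]    # column 4
priors = [f / sum(fatalities) for f in fatalities]        # column 5, P(B_i)

def binomial_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Assumed likelihood P(E | B_i): 3 fatalities in 16 ejections if rate i holds.
likelihoods = [binomial_pmf(3, 16, r) for r in rates]

# Denominator of equation (4.10), then the posterior for each hypothesis.
evidence = sum(p * l for p, l in zip(priors, likelihoods))
posteriors = [p * l / evidence for p, l in zip(priors, likelihoods)]

for year, prior, post in zip(range(1969, 1975), priors, posteriors):
    print(f"{year}: prior = {prior:.4f}, posterior = {post:.4f}")
```

Under these assumptions a hypothesis with zero prior weight, such as the 1971 rate, keeps zero posterior weight, which is a general property of equation (4.10).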

If suitable data exist, Bayesian statistical theory appears useful in the
analysis of the dichotomous ejection/injury problem area of this AAES study.

As mentioned, it is a very useful algorithm for refining a priori probability
estimates of the occurrence of various events.

4.4  Higher  Order  Contingency  Table  Data  Analysis 

Some information about contingency tables was developed in Section 3.2. Here,
an example of a higher order contingency table, and the analysis which
accompanies it, will be presented. Data for the work that follows were taken
from [1, pg. 28]. These data are based on actual combat ejection observations.


Ejection/injury  data  for  six  high  performance  aircraft  are  presented 
in  Table  4-7.  Here,  it  should  be  noted  that  two  injury  categories  are  studied: 
(1)  major  injuries,  and  (2)  minor  injuries. 


TABLE 4-7

COMBAT EJECTION INJURIES VERSUS AIRCRAFT TYPE

                                A-7/                            RA-5C/
          A-4/       A-6/       ESCAPAC    F-4/       F-8/      HS-1/
          ESCAPAC    Mk GRU-7   IC-3       Mk H-7     Mk F-7    HS-1A     TOTAL

Major     10         4          3          7          5         1         30
Minor     19         2          4          11         8         1         45

TOTAL     29         6          7          18         13        2         75

From  the  data  in  Table  4-7,  the  following  null  hypothesis  is  proposed: 


$H_0$: Ejection injury is independent of aircraft/seat type from which the
ejection was made.


Consider now the universe of 75 ejection related injuries experienced in combat
by aircrewmen upon ejection from the six different aircraft shown in Table 4-7.
The probability that a particular injury, from this universe of injuries,
occurred upon ejection from an A-4 aircraft is $P(A_4)$. Numerically, from
Table 4-7, $P(A_4)$ = 29/75.

Similarly,

$$P(A_6) = \frac{6}{75}; \quad P(A_7) = \frac{7}{75}; \quad P(F_4) = \frac{18}{75}; \quad P(F_8) = \frac{13}{75};$$

and

$$P(A_5) = \frac{2}{75}$$

Likewise from Table 4-7,

$$P(M) = \frac{30}{75} = \text{Probability of a major injury}$$

$$P(m) = \frac{45}{75} = \text{Probability of a minor injury}$$


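Under the hypothesis $H_0$, that is, ejection injury is independent of
aircraft/seat type,

$P(A_4 \cap M) = P(A_4) \, P(M)$ = Probability of acquiring a major injury upon
ejection from an A-4 aircraft.

Thus,

$$P(A_4 \cap M) = \left(\frac{29}{75}\right) \left(\frac{30}{75}\right) = 0.155$$

Similarly,

$$P(A_6 \cap M) = P(A_6) \, P(M) = \left(\frac{6}{75}\right) \left(\frac{30}{75}\right) = 0.032$$

$$P(A_7 \cap M) = \left(\frac{7}{75}\right) \left(\frac{30}{75}\right) = 0.037$$

$$P(F_4 \cap M) = \left(\frac{18}{75}\right) \left(\frac{30}{75}\right) = 0.096$$

$$P(F_8 \cap M) = \left(\frac{13}{75}\right) \left(\frac{30}{75}\right) = 0.069$$

$$P(A_5 \cap M) = \left(\frac{2}{75}\right) \left(\frac{30}{75}\right) = 0.011$$

Analogously, under $H_0$,

$P(A_4 \cap m) = P(A_4) \, P(m)$ = Probability of acquiring a minor injury upon
ejection from an A-4 aircraft.

The expected number of injuries in each cell is the product of the 75 observed
injuries and the corresponding joint probability. For major injuries, for
example,

$E_M(A_7)$ = (75)(0.037) = 2.8 ≈ 3

Likewise, the expected number of minor injuries is found as follows:

$E_m(A_4)$ = (75)(0.232) = 17.4 ≈ 17

$E_m(A_6)$ = (75)(0.048) = 3.6 ≈ 4

$E_m(A_7)$ = (75)(0.056) = 4.2 ≈ 4

$E_m(F_4)$ = (75)(0.144) = 10.8 ≈ 11

$E_m(F_8)$ = (75)(0.104) = 7.8 ≈ 8

$E_m(A_5)$ = (75)(0.016) = 1.2 ≈ 1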

From these data, based on actual combat ejection observations, Table 4-8 is
constructed.

TABLE 4-8

COMPARISON OF OBSERVED FREQUENCY VERSUS THEORETICAL FREQUENCY OF COMBAT
EJECTION INJURY OCCURRENCE (Theoretical Frequency is Shown in Parenthesis)

Aircraft/                       A-7/                            RA-5C/
Seat      A-4/       A-6/       ESCAPAC    F-4/       F-8/      HS-1/
Injury    ESCAPAC    Mk GRU-7   IC-3       Mk H-7     Mk F-7    HS-1A     TOTAL

Major     10(12)     4(2)       3(3)       7(7)       5(5)      1(1)      30
Minor     19(17)     2(4)       4(4)       11(11)     8(8)      1(1)      45

TOTAL     29(29)     6(6)       7(7)       18(18)     13(13)    2(2)      75(75)


The proposed hypothesis $H_0$ is tested with the Chi-square statistic.

$$\chi^2_{c/\text{d.f.}=5} = \sum_{c=1}^{6} \sum_{r=1}^{2} \frac{(o_{r,c} - e_{r,c})^2}{e_{r,c}} \tag{4.15}$$

where

$\nu$ = degrees of freedom, where $\nu$ = (r - 1)(c - 1); here r = 2, c = 6,
hence $\nu$ = 5 degrees of freedom. Note: r = rows, c = columns.

$o_{r,c}$ = the observed frequency at the "point" r, c in Table 4-8.

$e_{r,c}$ = the expected frequency at the "point" r, c.


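From Table 4-8, the Chi-square statistic is easy to compute. Thus,

$$\chi^2_{c/\text{d.f.}=5} = \frac{(10-12)^2}{12} + \frac{(4-2)^2}{2} + \frac{(3-3)^2}{3} + \frac{(7-7)^2}{7} + \frac{(5-5)^2}{5} + \frac{(1-1)^2}{1}$$

$$\qquad + \frac{(19-17)^2}{17} + \frac{(2-4)^2}{4} + \frac{(4-4)^2}{4} + \frac{(11-11)^2}{11} + \frac{(8-8)^2}{8} + \frac{(1-1)^2}{1}$$

$$\chi^2_{c/\text{d.f.}=5} = 3.569$$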


From Tables of the Chi-square statistic for $\nu$ = 5 degrees of freedom at
$\alpha$ = 0.05,

$$\chi^2_{0.05/\text{d.f.}=5} = 11.07$$

Since $\chi^2_c < \chi^2_{0.05}$ for 5 degrees of freedom, accept the null
hypothesis at the $\alpha$ = 0.05 level of significance and conclude that
ejection injury and the aircraft/seat type from which the ejection was made are
independent. This Chi-square test was performed without applying the Yates'
correction factor. Application of Yates' correction factor yields


$$\chi^2_{c/\text{d.f.}=5} = 3.027,$$

and the null hypothesis is still accepted.

Contingency table data analysis has been demonstrated, on the Ballistic
Spreader Gun problem, to be a very powerful tool in the analysis of ejection
seat data.
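The Table 4-7 test can be retraced with the sketch below. It computes the expected frequency of each cell from the marginal totals and then the Chi-square statistic twice: once with expected frequencies rounded to whole numbers, as in Table 4-8, and once unrounded. The critical value 11.07 is the tabulated figure quoted in the text; the function and variable names are illustrative.

```python
observed = {
    "Major": [10, 4, 3, 7, 5, 1],    # Table 4-7, one entry per aircraft/seat type
    "Minor": [19, 2, 4, 11, 8, 1],
}
column_totals = [sum(col) for col in zip(*observed.values())]
grand_total = sum(column_totals)

def chi_square(round_expected: bool) -> float:
    total = 0.0
    for row in observed.values():
        row_total = sum(row)
        for o, col_total in zip(row, column_totals):
            e = row_total * col_total / grand_total   # expected count under H0
            if round_expected:
                e = round(e)   # Table 4-8 lists whole-number expected counts
            total += (o - e) ** 2 / e
    return total

chi2_rounded = chi_square(round_expected=True)
chi2_unrounded = chi_square(round_expected=False)
CRITICAL_5_DF = 11.07   # tabulated value at alpha = 0.05, quoted in the text

print(f"rounded expected counts  : {chi2_rounded:.3f}")
print(f"unrounded expected counts: {chi2_unrounded:.3f}")
print("accept H0:", chi2_rounded < CRITICAL_5_DF)
```

Rounding the expected frequencies changes the statistic noticeably (the unrounded value is smaller), but both fall well below the critical value, so the conclusion is unaffected.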

In  the  interest  of  some  generality,  an  extension  of  the  contingency 
tables  presented  thus  far  now  will  be  given.  The  generalization  will  be  to  a 
contingency  table  having  m-rows  and  n-columns.  It  will  be  known  as  an  m x n 
contingency  table.  An  example  is  displayed  in  Table  4-9. 

Hypothesis  testing,  using  the  Chi-square  statistic  in  conjunction 
with  the  m x n contingency  table,  is  an  extension  of  equation  (3.7)  which  is 
used  to  perform  hypothesis  testing  with  a 2 x 2 contingency  table.  The  equa- 
tion follows. 


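$$\chi^2_{c/\text{d.f.}=(m-1)(n-1)} = \frac{N}{N_{1,n}} \sum_{i=1}^{n} \frac{a_{1,i}^2}{M_{m,i}} + \frac{N}{N_{2,n}} \sum_{i=1}^{n} \frac{a_{2,i}^2}{M_{m,i}} + \cdots + \frac{N}{N_{m,n}} \sum_{i=1}^{n} \frac{a_{m,i}^2}{M_{m,i}} - N \tag{4.16}$$

In compact notation, equation (4.16) is written:

$$\chi^2_{c/\text{d.f.}=(m-1)(n-1)} = N \sum_{j=1}^{m} \sum_{i=1}^{n} \frac{a_{j,i}^2}{N_{j,n} \, M_{m,i}} - N \tag{4.17}$$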


The  coefficient  of  contingency  can  be  found  by  using  equation  (3.9). 


It may happen that small values are given for both m and n. In this case,
equation (4.15), and more generally equation (4.17), represent only
approximations to the Chi-square statistic. Thus, a correction, due to Yates,
should be applied. As was suggested by Yates [7, pg. 230; pg. 259], the
approximation to $\chi^2$ is improved by replacing one cell frequency, say
$a_{i,j}$, by $a_{i,j} + 1/2$, and adjusting the other elements in Table 4-9 so
as to keep the marginal totals unaltered. Writing an alternate form of equation
(4.17) as follows (here let $a_{i,j}$ = observed frequency in the i,j-th cell):


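$$\chi^2_{c/\text{d.f.}=(m-1)(n-1)} = \sum_{i=1}^{m} \sum_{j=1}^{n} \frac{(a_{i,j} - e_{i,j})^2}{e_{i,j}} \tag{4.18}$$

Application of the Yates' correction to equation (4.18) yields the following:

$$\chi^2_{c/\text{d.f.}=(m-1)(n-1)} = \sum_{i=1}^{m} \sum_{j=1}^{n} \frac{\left(\left|a_{i,j} - e_{i,j}\right| - 0.5\right)^2}{e_{i,j}} \tag{4.19}$$

where from Table 4-9,

$$e_{i,j} = \frac{N_{i,n} \, M_{m,j}}{N} \tag{4.20}$$

and $e_{i,j}$ = expected frequency in the i,j-th cell,
$N_{i,n}$ = total of observed frequencies in the i-th row,
$M_{m,j}$ = total of observed frequencies in the j-th column.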
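The general m x n forms can be sketched in a few lines. The code below implements equations (4.18) and (4.19) with expected frequencies from equation (4.20), plus the compact form of equation (4.17), and confirms numerically that (4.17) and (4.18) give the same statistic. The 2 x 2 cell counts used as input are those quoted for the neck injury table in Section 4.5.4; their arrangement into rows and all function names are assumptions of this sketch.

```python
def chi_square(table, yates=False):
    """Equation (4.18), or equation (4.19) when yates=True."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, a in enumerate(row):
            e = row_totals[i] * col_totals[j] / n_total   # equation (4.20)
            diff = abs(a - e) - 0.5 if yates else a - e
            stat += diff**2 / e
    return stat

def chi_square_compact(table):
    """Equation (4.17): N * sum(a**2 / (row total * column total)) - N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n_total = sum(row_totals)
    weighted = sum(
        a**2 / (row_totals[i] * col_totals[j])
        for i, row in enumerate(table)
        for j, a in enumerate(row)
    )
    return n_total * weighted - n_total

# Assumed layout: first row (a1, a2), second row (b1, b2).
neck_table = [[2, 5], [38, 5]]

plain = chi_square(neck_table)
compact = chi_square_compact(neck_table)
corrected = chi_square(neck_table, yates=True)
print(f"(4.18): {plain:.3f}  (4.17): {compact:.3f}  (4.19): {corrected:.3f}")
```

The uncorrected value reproduces the 13.46 quoted in Section 4.5.4, and the Yates' correction lowers it, as expected when cell frequencies are small.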


4.5  Non-parametric (Distribution Free) Statistical Tests


Several  methods  have  been  developed  recently  which  make  it  possible  to 
judge  the  randomness  of  observed  data  on  the  basis  of  the  order  in  which  the 
observations  are  obtained.  Thus  a test  can  be  performed  to  determine,  within 
probabilistic  limits,  whether  the  data  contain  patterns  that  look  suspiciously 
non-random.  It  is  interesting  to  note  that  this  test  can  be  applied  after  the 
data  are  collected.  The  technique  is  based  on  the  theory  of  runs.  This  is 
a non-parametric  statistical  technique.  Another  non-parametric  technique,  to 
be  presented  shortly,  is  the  trend  analysis  of  data. 

4.5.1  The  Run  Test 


A run is a succession of identical symbols which is followed and preceded by
different symbols or no symbols at all. To illustrate, let n = non-defective
pieces and d = defective pieces produced by a given machine. Suppose the pieces
manufactured by the machine form the following sequence:

(n, n, n, n, n) (d, d, d, d) (n, n, n, n, n, n, n, n, n, n) (d, d)

(n, n) (d, d, d, d) (n) (d, d) (n, n)

Here, there are 5 non-defectives, followed by 4 defectives, and so on, ending
with 2 defectives followed by 2 non-defectives. Each set of symbols in
parentheses represents a run. In this example, there are u = 9 runs, $n_1$ = 20
non-defectives and $n_2$ = 12 defectives.
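Counting runs is mechanical, and a short sketch makes the definition concrete. The sequence below is the machine example from the text written as a string; the function name `count_runs` is illustrative.

```python
from itertools import groupby

def count_runs(sequence):
    """Return the number of runs and a list of (symbol, run length) pairs."""
    runs = [(symbol, len(list(group))) for symbol, group in groupby(sequence)]
    return len(runs), runs

# n = non-defective piece, d = defective piece, in order of manufacture.
pieces = "nnnnn" "dddd" "nnnnnnnnnn" "dd" "nn" "dddd" "n" "dd" "nn"

u, runs = count_runs(pieces)
n1 = pieces.count("n")
n2 = pieces.count("d")
print(f"u = {u} runs, n1 = {n1} non-defectives, n2 = {n2} defectives")
print(runs)
```

`groupby` collects consecutive identical symbols, so each group it yields is exactly one run in the sense defined above.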

The  total  number  of  runs  appearing  in  an  arrangement  of 
this  kind  is  often  a good  indication  of  a possible  lack  of  randomness.  If 
there  are  too  few  runs,  a definite  grouping  or  clustering  might  be  suspected, 
perhaps  even  a trend.  If  too  many  runs  are  observed,  some  sort  of  repeated 
alternating  pattern  might  be  suspected. 

The  probability  density  function  for  the  distribution  of 
u-runs  is  [5,  pg.  353]  : 


In equations (4.21) and (4.22),
u = the number of runs,

$n_1$ = the number of observations having a given attribute, 1, say,

$n_2$ = the total number of observations less the number of observations,
$n_1$, having a given attribute, 1, say.

In  this  example,  it  should  be  reiterated  that  a dichotomous 
situation  is  under  observation.  That  is,  an  event  either  has  a given  attribute, 
or  it  does  not  have  that  attribute. 


To consider a concrete example applicable to the AAES, a sample of fifty A-4
operational ejections was observed, and injuries classified according to the
injury classification scheme shown in Table 4-3. The following sequence of
observations was made.

It was decided to partition the injuries into two sets: (1) classifications 1
and 2, that is, no injuries and minor injuries, and (2) classifications 3, 4,
and 5, that is, major injuries, fatalities and other. This is now a dichotomous
situation. Let n represent the first set, and i represent the second set. The
preceding run can now be written in dichotomous notation as follows:

From these data, the following numbers are derived:
$n_1$ = non-major injured = 37

$n_2$ = major injured, including fatalities = 13
u = number of runs = 19

Expressions for the mean and variance of u when $n_1$ and $n_2$ both exceed 10
are defined as follows [5, pg. 354]:

$$E(u) = \frac{2 n_1 n_2}{n_1 + n_2} + 1 \tag{4.23}$$

and

$$\text{Var}(u) = \frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)} \tag{4.24}$$

Substituting from above for $n_1$ and $n_2$,

$$E(u) = \frac{(2)(37)(13)}{37 + 13} + 1 = 20.24$$

$$\text{Var}(u) = \frac{(2)(37)(13) \left[(2)(37)(13) - 37 - 13\right]}{(37 + 13)^2 (37 + 13 - 1)}$$

$$\text{Var}(u) = 7.161992$$

$$\sqrt{\text{Var}(u)} = 2.67619$$

The null hypothesis being tested is the following:

$H_0$: The dichotomous A-4 injury data shown in Table 4-10 is comprised of two
random samples of n's and i's which have the same parent population density
function.

This hypothesis is to be tested at the $\alpha$ = 0.05 level of significance
with the statistic

$$z_c = \frac{u - E(u)}{\sqrt{\text{Var}(u)}} \tag{4.25}$$

under the assumption that $n_1$ and $n_2$ are sufficiently large (greater than
10) so that the sampling distribution of u can be approximated with a normal
density function. Substituting numbers generated into equation (4.25) yields:

$$z_c = \frac{19.0 - 20.24}{2.67619} = -0.46335$$

Since $z_{\alpha/2} = z_{0.025}$ = 1.96, it is clear that -1.96 < -0.46335 <
+1.96, thus the null hypothesis is accepted at the $\alpha$ = 0.05 level of
significance, and we conclude that both samples are extracted from the same
parent population density function.

It would be interesting to compute f(u) from equation (4.21). Thus, using the
numbers developed above, that is, u = 19 = 2k + 1, hence k = 9, $n_1$ = 37, and
$n_2$ = 13. Thus the probability of observing exactly 19 runs is

f(19) = 0.150082
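The exact probability quoted above can be reproduced with the sketch below. It uses the standard exact distribution of the total number of runs in a random arrangement of two kinds of symbols, with separate cases for even u = 2k and odd u = 2k + 1. That this is the form expressed by equations (4.21) and (4.22) is an assumption; the function name `runs_pmf` is illustrative.

```python
from math import comb

def runs_pmf(u: int, n1: int, n2: int) -> float:
    """Exact probability of u runs among n1 symbols of one kind and n2 of another."""
    if u < 2:
        return 0.0
    arrangements = comb(n1 + n2, n1)
    if u % 2 == 0:   # even case, u = 2k
        k = u // 2
        ways = 2 * comb(n1 - 1, k - 1) * comb(n2 - 1, k - 1)
    else:            # odd case, u = 2k + 1
        k = (u - 1) // 2
        ways = (comb(n1 - 1, k) * comb(n2 - 1, k - 1)
                + comb(n1 - 1, k - 1) * comb(n2 - 1, k))
    return ways / arrangements

f_19 = runs_pmf(19, 37, 13)
total_probability = sum(runs_pmf(u, 37, 13) for u in range(2, 37 + 13 + 1))
print(f"f(19) = {f_19:.6f}")
print(f"sum over all u = {total_probability:.6f}")
```

With $n_1$ = 37 and $n_2$ = 13 the sketch returns the value quoted in the text for u = 19, and the probabilities over all admissible u sum to one.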

The familiar concept of "level-of-significance" is illustrated, for this
particular problem, by Figure 4-3. Since $n_1$ and $n_2$ are both greater than
10, the sampling distribution of u is assumed to be normal. Recall that
"level-of-significance" is the probability of a Type I error. It is the
probability of rejecting an hypothesis when in fact the hypothesis is true.
The standard normal density function is

$$f(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2} \tag{4.26}$$

where z is standardized by equation (4.25). Clearly,

$$\int_{-\infty}^{1.96} f(z) \, dz = \int_{-\infty}^{0} f(z) \, dz + \int_{0}^{1.96} f(z) \, dz = 0.5000 + 0.4750 = 0.9750$$

Hence

$$\int_{1.96}^{\infty} f(z) \, dz = \int_{-\infty}^{-1.96} f(z) \, dz = 1.0 - 0.975$$

Thus,

$$\int_{-\infty}^{-1.96} f(z) \, dz = 0.025.$$

For a two-tail test:

$$\alpha = \int_{-\infty}^{-1.96} f(z) \, dz + \int_{1.96}^{\infty} f(z) \, dz = 0.025 + 0.025$$

Thus, $\alpha$ = 0.05. Now, since $z_c$ = -0.46335, clearly

$$-1.96 < z_c < +1.96,$$

that is, $z_c$ falls in the acceptance region. Hence at the $\alpha$ = 0.05
level of significance, the null hypothesis is accepted.
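The large-sample run test can be summarized in a short sketch that evaluates equations (4.23) through (4.25) and checks the two-tail area numerically. The normal distribution function is built from `math.erf`; the function names are illustrative.

```python
from math import erf, sqrt

def runs_z_test(u, n1, n2):
    """Return E(u), Var(u) and z_c from equations (4.23), (4.24), (4.25)."""
    mean = 2 * n1 * n2 / (n1 + n2) + 1
    var = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
           / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    z = (u - mean) / sqrt(var)
    return mean, var, z

def standard_normal_cdf(z):
    """Area under the standard normal density from minus infinity to z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mean_u, var_u, z_c = runs_z_test(u=19, n1=37, n2=13)
two_tail_alpha = 2 * standard_normal_cdf(-1.96)   # both tails beyond +/- 1.96
accept_h0 = -1.96 < z_c < 1.96

print(f"E(u) = {mean_u:.2f}, Var(u) = {var_u:.6f}, z_c = {z_c:.5f}")
print(f"two-tail alpha at 1.96 = {two_tail_alpha:.4f}, accept H0: {accept_h0}")
```

The same function applies to any dichotomized sequence once u, $n_1$ and $n_2$ have been counted, provided both counts exceed 10 as the text requires.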


4.5.2  Tests on Runs Above and Below the Median

The  theory  of  runs  above  and  below  the  median  also  may  be 
used  to  test  the  hypothesis  that  observations  have  been  drawn  at  random  from  a 
single  population.  To  perform  this  experiment,  find  the  median  of  the  sample 
and  denote  observations  above  the  median  by  an  a,  and  those  below  the  median 
by  a b.  If  the  number  of  runs  of  a's  and  b's  is  larger  or  smaller  than  might 
be  expected  by  chance,  reject  the  hypothesis  that  the  observations  have  been 
drawn  at  random  from  a single  population. 

From the data in Table 4-10, compute the mean value, assume it is the same as
the median, then form Table 4-11 by marking each value in Table 4-10 with an a
if it lies above the median or a b if it lies below it. The mean of the data in
Table 4-10 is $\bar{x}$ = 98/50 = 1.96. Table 4-11 is easy to construct.

TABLE  4-11 

DEFINITIVE  CATEGORIES  OF  A-4  EJECTION  INJURY  DATA 

b, b, a, a, a, b, a, a, a, a, b, a, a, b, b, a, a,

a, b, a, a, a, a, b, a, a, b, b, a, b, a, b, b, a,

a, b, a, b, a, b, a, a, b, a, b, b, b, b, b, b.

In Table 4-11,

$n_1$ = 27, $n_2$ = 23, and u = 27,

where $n_1$ = number above median, a
$n_2$ = number below median, b
u = number of runs

From equations (4.23) and (4.24):

$$E(u) = \frac{(2)(27)(23)}{27 + 23} + 1 = 25.84$$

$$\text{Var}(u) = \frac{(2)(27)(23) \left[(2)(27)(23) - 27 - 23\right]}{(27 + 23)^2 (27 + 23 - 1)}$$

$$\text{Var}(u) = 12.085$$

$$\sqrt{\text{Var}(u)} = 3.4764$$

Proceeding as before, compute $z_c$:

$$z_c = \frac{u - E(u)}{\sqrt{\text{Var}(u)}} = \frac{27 - 25.84}{3.48}$$

$$z_c = 0.33369$$

Since $-z_{0.025} < z_c < z_{0.025}$, that is, since -1.96 < 0.33369 < +1.96,
the hypothesis of randomness cannot be rejected at the level of significance
$\alpha$ = 0.05, and we conclude that the sample of a's and b's in Table 4-11
has been drawn at random from a single parent population.
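As a check on the counts above, the sketch below reads the a/b sequence of Table 4-11 directly, counts the symbols and runs, and evaluates the test statistic. The sequence is transcribed from the table row by row; variable names are illustrative.

```python
from itertools import groupby
from math import sqrt

# Table 4-11, transcribed row by row (a = above the median, b = below).
table_4_11 = (
    "bbaaabaaaabaabbaa"
    "abaaaabaabbababba"
    "abababaababbbbbb"
)

n_above = table_4_11.count("a")
n_below = table_4_11.count("b")
u = sum(1 for _ in groupby(table_4_11))   # each group of equal symbols is one run

# Equations (4.23), (4.24) and (4.25).
pair = 2 * n_above * n_below
total = n_above + n_below
mean_u = pair / total + 1
var_u = pair * (pair - total) / (total**2 * (total - 1))
z_c = (u - mean_u) / sqrt(var_u)

print(f"n1 = {n_above}, n2 = {n_below}, u = {u}")
print(f"E(u) = {mean_u:.2f}, Var(u) = {var_u:.3f}, z_c = {z_c:.4f}")
```

The computed statistic differs from the quoted 0.33369 only in the fourth decimal place, a difference attributable to rounding of the standard deviation in the hand calculation.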


4.5.3  The  Trend  Test 


Consider a given data set $x_i$, i = 1, 2, ..., N, for example the set of
injury data shown in Table 4-10. In this data set, count the number of times
that $x_i > x_j$ for i < j. This total number of inequalities or reverse
arrangements contained in the set {$x_i$} is equal to some number y. If the N
data points are independent observations of the same variable x, then y is a
random variable. The variable y has a mean and variance given by:

$$\mu_y = \frac{N (N - 1)}{4} \tag{4.27}$$

and

$$\sigma_y^2 = \frac{2N^3 + 3N^2 - 5N}{72} \tag{4.28}$$


If N is sufficiently large, then y is approximately normally distributed with
mean and variance defined by equations (4.27) and (4.28). Under these
conditions, a two-tail level of significance test can be performed on the
number of observed trends, here defined as reverse arrangements, detected in a
given data set. Given a significance level, $\alpha$, two points $N_1$ and
$N_2$ can be determined such that


Translate  the  normal  density  function  defined  by  equation 
(4.29)  to  one  defined  by  equation  (4.26)  by  the  linear  transformation: 


In the event $\alpha$ = 0.05, the above reduces to (see tables of the standard
normal density function, i.e., $\mu$ = 0, $\sigma$ = 1.0):

$$N_1 = -1.96 \, \sigma_y + \mu_y \tag{4.33}$$

$$N_2 = 1.96 \, \sigma_y + \mu_y \tag{4.34}$$

Consider now the data set shown in Table 4-10. Here, N = 50. Hence, from
equations (4.27), (4.28), (4.33) and (4.34)

$$\mu_y = \frac{(50)(50 - 1)}{4} = 612.5$$

$$\sigma_y^2 = \frac{(2)(50)^3 + (3)(50)^2 - (5)(50)}{72}$$

$$\sigma_y^2 = 3572.917; \quad \sigma_y = 59.774$$

$$N_1 = (-1.96)(59.774) + 612.5 = 495.3$$

$$N_2 = (1.96)(59.774) + 612.5 = 729.7$$

Now postulate the following null hypothesis:

$H_0$: The A-4 ejection injury profile data (Table 4-10) contain no underlying
trend.

To test the null hypothesis, compute the number of trends (reverse
arrangements) in the injury profile data. Then, if

$$N_1 \le (\text{number of observed trends}) \le N_2$$

accept the null hypothesis at the $\alpha$ level of significance. Otherwise
reject it. The number of trends was computed and is displayed in Table 4-12
below.

TABLE 4-12

TREND ANALYSIS OF A-4 EJECTION INJURY DATA

(Sample Extracted from All A-4 Ejections Over the Time Period 1969-1975)

$$N_1 < n < N_2,$$

or,

$$495.3 < 570 < 729.7.$$

Therefore,  at  the  a = 0.05  level  of  significance,  the  null  hypothesis  of  no 
underlying  trend  in  this  sample  of  A-4  ejection  injury  data  is  accepted. 
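The trend test reduces to counting reverse arrangements and comparing the count with the limits of equations (4.33) and (4.34). A minimal sketch follows; the three-point data list is a hypothetical toy example used only to show the counting rule, while N = 50 and the observed count of 570 are the values quoted in the text.

```python
from math import sqrt

def reverse_arrangements(data):
    """Count pairs (i, j) with i < j and data[i] > data[j]."""
    return sum(
        1
        for i in range(len(data))
        for j in range(i + 1, len(data))
        if data[i] > data[j]
    )

def trend_limits(n_points, z=1.96):
    """Acceptance limits N1, N2 from equations (4.27), (4.28), (4.33), (4.34)."""
    mu = n_points * (n_points - 1) / 4
    var = (2 * n_points**3 + 3 * n_points**2 - 5 * n_points) / 72
    sigma = sqrt(var)
    return mu - z * sigma, mu + z * sigma

lower, upper = trend_limits(50)
observed_trends = 570   # count quoted in the text for the Table 4-10 data
no_trend = lower <= observed_trends <= upper

toy_count = reverse_arrangements([3, 1, 2])   # hypothetical data: pairs (3,1), (3,2)
print(f"N1 = {lower:.1f}, N2 = {upper:.1f}, accept H0: {no_trend}")
print("reverse arrangements in [3, 1, 2]:", toy_count)
```

The double loop is quadratic in N, which is of no consequence for samples of the size considered here.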

4.5.4  Fisher's  Exact  Test  as  Applied  to  Contingency  Tables 


The Chi-square goodness-of-fit test gives a good measure of dependency among
the various entries in a contingency table if m and n, the row and column
dimensions, respectively, are large and if the number of elements in each cell
(frequency of occurrence) is large. If the number of elements in each cell is
small, the Yates' correction factor, defined by equation (4.19), should be
applied to obtain a more realistic measure of dependency. As an example,
consider Table 3-5, repeated here in complete form as Table 4-13 for
convenience.

TABLE 4-13

A 2 x 2 NECK INJURY CONTINGENCY TABLE

The entries in parenthesis in Table 4-13 are the expected values as computed
from equation (4.20). The Chi-square statistic, computed by equation (4.18), is
$\chi^2_c$ = 13.46. Thus, the hypothesis of independence between neck injuries
and updated seats is rejected at the $\alpha$ = 0.05 level of significance,
where $\chi^2_{0.05}$ = 3.841 for one degree of freedom.

This value of $\chi^2_c$ = 13.46 could be seriously in error. Therefore Yates'
correction factor, defined by equation (4.19), is applied.


The corrected statistic is $\chi^2_c \approx$ 9.98. This corresponds to a level
of significance, found by linear interpolation, of $\alpha$ = 0.00216. The null
hypothesis of independence between neck injuries and updated seats still would
be rejected at the given level-of-significance; however, the numerical value
for the Chi-square statistic is more realistic.

For  a 2 x 2 contingency  table,  Fisher  [7,  pg.  230]  derived 
a method  for  obtaining  an  exact  probability  of  occurrence  among  entries  in  the 
contingency  table.  This  method  also  is  known  as  the  exact  distribution  for 
2x2  contingency  tables  [10,  pg.  96]. 


In essence, Fisher's Exact Test takes advantage of the Hypergeometric density
function defined by the equation:

$$f(x; n, \alpha, \beta) = \frac{\binom{\alpha}{x} \binom{\beta}{n - x}}{\binom{\alpha + \beta}{n}} \tag{4.35}$$


This  discrete  density  function  is  associated  with  the  physical  problem  of 
sampling  without  replacement. 


To apply the above to a 2 x 2 contingency table, consider the table shown in
Figure 3-1. Compare contingency table entries with the symbols in equation
(4.35) and hence establish the following correspondences:

$$f[a_1; (a_1 + b_1), (a_1 + a_2), (b_1 + b_2)] = \frac{\binom{a_1 + a_2}{a_1} \binom{b_1 + b_2}{b_1}}{\binom{a_1 + a_2 + b_1 + b_2}{a_1 + b_1}} \tag{4.36}$$

Reduce the binomial coefficients to factorials, write the left member of
equation (4.36) as $f_1(a_1)$, then equation (4.36) becomes:

$$f_1(a_1) = \frac{(a_1 + a_2)! \, (a_1 + b_1)! \, (a_2 + b_2)! \, (b_1 + b_2)!}{a_1! \, a_2! \, b_1! \, b_2! \, (a_1 + a_2 + b_1 + b_2)!} \tag{4.37}$$

The right member of equation (4.37) is invariant under a permutation of entries
in the contingency table. For example:

$$f[b_2; (a_2 + b_2), (b_1 + b_2), (a_1 + a_2)] = \frac{\binom{b_1 + b_2}{b_2} \binom{a_1 + a_2}{a_2}}{\binom{a_1 + a_2 + b_1 + b_2}{a_2 + b_2}} \tag{4.38}$$

The right member of equation (4.38) reduces to the right member of (4.37).
Thus, without loss of generality, assume that $a_1 = \min(a_1, a_2, b_1, b_2)$.
Equation (4.37) gives the exact probability of observing the two fractions
$p_1 = a_1/(a_1 + a_2)$ and $p_2 = b_1/(b_1 + b_2)$ when there is no class
difference. That is, under these conditions, the true dichotomy for each class
is in the proportion $(a_1 + b_1) : (a_2 + b_2)$. To obtain the final
probability used in determining whether a significant difference exists between
$p_1$ and $p_2$, the probabilities of more divergent fractions than those
observed must be added to equation (4.37). The next more divergent situation,
assuming $p_1 < p_2$, is obtained by decreasing $a_1$ and $b_2$ and increasing
$a_2$ and $b_1$ each by unity. Thus, analogous to equation (4.37):

$$f_2(a_1 - 1) = \frac{(a_1 + a_2)! \, (a_1 + b_1)! \, (a_2 + b_2)! \, (b_1 + b_2)!}{(a_1 - 1)! \, (a_2 + 1)! \, (b_1 + 1)! \, (b_2 - 1)! \, (a_1 + a_2 + b_1 + b_2)!} \tag{4.39}$$

A general expression can now be written:

$$f_i(a_1 - i + 1) = \frac{(a_1+a_2)!\,(a_1+b_1)!\,(a_2+b_2)!\,(b_1+b_2)!}{(a_1-i+1)!\,(a_2+i-1)!\,(b_1+i-1)!\,(b_2-i+1)!\,(a_1+a_2+b_1+b_2)!}, \qquad (4.40)$$

where $i = 1, 2, \ldots, (a_1 + 1)$.

Finally, the probability of observing two fractions as much or
more divergent than $p_1$ and $p_2$ when there is no difference in the
sampled population is given by [9, pg. 96]:

$$P = \sum_{i=1}^{a_1+1} f_i(a_1 - i + 1) \qquad (4.41)$$


To apply Fisher's Exact Probability Test to Table 4-13,
let $a_1 = 2$, $a_2 = 5$, $b_1 = 38$, and $b_2 = 5$. Then from equation (4.40), when
$i = 1$,

$$f_1(a_1) = \frac{7!\,40!\,10!\,43!}{2!\,5!\,38!\,5!\,50!} = 0.0019679$$

Likewise, from equation (4.40), with $i = 2$,

$$f_2(a_1 - 1) = \frac{7!\,40!\,10!\,43!}{1!\,6!\,39!\,4!\,50!} = 0.0000841$$

and with $i = 3$,

$$f_3(a_1 - 2) = \frac{7!\,40!\,10!\,43!}{0!\,7!\,40!\,3!\,50!} = 0.00000120$$


From equation (4.41), P = 0.002053, corresponding to $\chi^2 = 10.0534$, or for a
two-tail test, P = 0.004106. Thus, the probability of the observed contingency
Table 4-13 is 0.0019679, and the probabilities of the other two more
extreme cases are 0.0000841 and 0.00000120, respectively.
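The hand calculation above can be checked with a short script. The following is a minimal sketch, not part of the original report: the function names `table_probability` and `fisher_one_tail` are illustrative, and the sketch assumes, as the text does, that $a_1$ is the smallest of the four table entries.

```python
from math import comb

def table_probability(a1, a2, b1, b2):
    """Exact probability of one 2 x 2 table with fixed margins, as in equation (4.37)."""
    total = a1 + a2 + b1 + b2
    return comb(a1 + a2, a1) * comb(b1 + b2, b1) / comb(total, a1 + b1)

def fisher_one_tail(a1, a2, b1, b2):
    """Return the terms f_i of equation (4.40) and their sum, equation (4.41).

    Assumes a1 is the smallest of the four entries.
    """
    terms = [
        table_probability(a1 - k, a2 + k, b1 + k, b2 - k)
        for k in range(a1 + 1)  # k = i - 1 for i = 1, ..., a1 + 1
    ]
    return terms, sum(terms)

# Entries of Table 4-13 as quoted in the text.
terms, p_one_tail = fisher_one_tail(2, 5, 38, 5)
for i, f in enumerate(terms, start=1):
    print(f"f_{i} = {f:.7f}")
print(f"one-tail P = {p_one_tail:.6f}, doubled = {2 * p_one_tail:.6f}")
```

Working with binomial coefficients rather than raw factorials keeps every intermediate value an exact integer until the final division.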


From the above results, it is seen that Fisher's exact
probability one-tail test (P = 0.002053) closely approximates the Chi-square
goodness-of-fit test to which Yates' correction factor has been applied
(P = 0.00216); under either test, the null hypothesis of independence between
neck injuries and updated seats is still rejected. To summarize, the preceding null
hypothesis is rejected at the $\alpha = 0.05$ level of significance by the following
tests: (1) the uncorrected Chi-square test applied to a contingency table,
(2) the corrected (Yates' correction) Chi-square test applied to a contingency
table, and (3) Fisher's Exact Test.




5.0  SUMMARY 


This report addresses activities under Phase I of a four-phase project
to analyze AAES data. Specifically, the charter was to establish a statistical
analysis methodology which could be applied to AAES ejection data. It
was recognized early that specific analyses may need to be modified, depending
on available data; that is, some analytic techniques might
need to be added and others eliminated. In addition, an outline of Phases II,
III and IV was to be included in this Phase I report.

This final technical report covering Phase I activities contains an approach
to the construction of a statistical analysis methodology for application
to AAES ejection data. The derived statistical techniques are applied
to actual AAES data. Although primary emphasis was placed on injury data,
it should be noted that the displayed techniques are data independent. That
is, they can be applied to any data, in general, and to AAES data, in particular,
that are amenable to statistical analysis.

The report is heavily illustrated with numerical examples, together with
interpretations. In several instances, confidence measures are attached to
the particular results obtained. Again, it should be noted that actual AAES
ejection data, extracted from both the MORs and combat information, were used
throughout this report. Application of the techniques to actual AAES ejection
data will continue in Phase II of the investigation.


To reiterate briefly, the direction under which this work is being pursued
is categorized as follows:

• Phase I. Develop and illustrate a statistical analysis methodology
  applicable to AAES ejection data.

• Phase II. Apply the statistical analysis methodology developed
  in Phase I to a particular AAES ejection seat.

• Phase III. Expand the specific application illustrated in Phase II
  to a wide class of ejection seats.

• Phase IV. Based on findings and recommendations in Phases II and
  III, outline a system for monitoring safety aspects of AAES.


Proposed procedures for Phases II, III and IV of this project are outlined
in Appendix B, Sections 9, 10 and 11, respectively.


APPENDIX A: ADDITIONAL STATISTICAL ANALYSES


7.0  Continuous  Probability  Density  Functions 

Selected discrete probability density functions were mentioned earlier.
It was shown that the binomial density function is useful in the analysis
of a dichotomous situation. The Poisson is useful in certain applications
of the binomial density function. It is also very useful in many analyses
of system reliability problems. The multinomial probability density function
is useful if a situation can assume polychotomous states. Examples illustrating
use of each of these density functions were displayed.

Here, a brief presentation will be made of three useful continuous probability
density functions: (1) Normal, (2) Gamma, and (3) Beta. The normal
density function is displayed because it can be viewed as the foundation of
modern statistical theory. The gamma contains the exponential density function
as a special case, hence it is useful not only in classical statistical
analysis, but in the field of reliability as well. An example of goodness-of-fit
to the gamma density function will be given. Also, an example of numerically
integrating the gamma density function is shown. The beta density
function has the uniform density function as a special case. Further, it
evolves quite naturally from the gamma density function. The beta density
function is useful in Bayesian statistics.


7.1 The Normal Probability Density Function

The normal density function is useful in data analysis because it
is used to tell whether a particular phenomenon, reflected by the data, occurred
randomly, that is, by chance. If the event occurred randomly, clearly
there was a cause underlying the occurrence, and that cause may be very important.
If the event did not occur randomly, then by elimination, there is
an underlying deterministic reason for occurrence of the phenomenon. If deterministic
causes are suspected, more intensive analyses need to be performed
to discover what these are. After discovering the causes, recommendations
need to be made to management. As an example, the normal density function

FIGURE  7-1.  GRAPH  OF  THE  NORMAL  PROBABILITY  DENSITY  FUNCTION 


The equation of a general normal probability density function is

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/2\sigma^2} \qquad (7.1)$$

where

$\sigma^2$ = variance of the parent density function
$\mu$ = mean of the parent density function
$x$ = a continuous independent variable
$f(x)$ = a continuous dependent variable.

This function has been demonstrated to be a probability density function by
many investigators. Part of their investigation involves showing that

$$\int_{-\infty}^{\infty} f(x)\,dx = 1.0.$$

The cumulative distribution function is defined as follows:

$$F(y) = \int_{-\infty}^{y} f(x)\,dx, \qquad (7.2)$$

and the probability that $a < x < b$ is found as follows:

$$P(a < x < b) = \int_{x=a}^{b} f(x)\,dx. \qquad (7.3)$$

From equation (7.3), it is clear that $P(x = a) = 0$, as is true for other continuous
probability density functions. The normal probability density function will be used
to test various hypotheses of random phenomena in the AAES ejection data.
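Equations (7.1) through (7.3) can be evaluated without tables. The sketch below is illustrative only: the function names are not from the report, and the cumulative distribution is expressed through the error function `erf`, a standard identity for the normal integral.

```python
from math import erf, exp, pi, sqrt

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal density in the standard form of equation (7.1)."""
    return exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / (sigma * sqrt(2.0 * pi))

def normal_cdf(y, mu=0.0, sigma=1.0):
    """F(y) of equation (7.2), written in terms of the error function."""
    return 0.5 * (1.0 + erf((y - mu) / (sigma * sqrt(2.0))))

def normal_interval(a, b, mu=0.0, sigma=1.0):
    """P(a < x < b) of equation (7.3)."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

# Probability within 1.96 standard deviations of the mean (close to 0.95).
print(round(normal_interval(-1.96, 1.96), 4))
```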

7.2  The  Gamma  Probability  Density  Function 

The gamma probability density function can be inferred from the
gamma function which, in general form, is defined by the expression:

$$\Gamma(\alpha) = \frac{1}{\beta^{\alpha}} \int_{x=0}^{\infty} x^{\alpha-1} e^{-x/\beta}\,dx. \qquad (7.4)$$

From equation (7.4), the gamma probability density function immediately emerges:

$$f(x) = \frac{1}{\Gamma(\alpha)\,\beta^{\alpha}}\, x^{\alpha-1} e^{-x/\beta}; \quad 0 < x < \infty,\ \alpha > 0,\ \beta > 0. \qquad (7.5)$$

Upon inspection of equations (7.4) and (7.5), it is seen that

$$\int_{0}^{\infty} f(x)\,dx = 1.0,$$

where $f(x)$ is defined by equation (7.5). A graph of the gamma probability
density function is shown in Figure 7-2.


FIGURE 7-2. GRAPH OF THE GAMMA PROBABILITY DENSITY FUNCTION
FOR VARIOUS VALUES OF $\alpha$ AND $\beta$


In equation (7.5), it should be noted that the exponential probability
density function is derived whenever $\alpha = 1$. Thus,

$$f_1(x) = \frac{1}{\beta}\, e^{-x/\beta}; \quad 0 < x < +\infty,\ \beta > 0. \qquad (7.6)$$

The probability density function defined by equation (7.6) is used quite often
with success in reliability and maintainability investigations. Here, "success"
is defined as the discovery of a parent probability density function.
The importance of such a discovery resides in the predictive capability which
it gives an analyst. Additional exponential probability density functions
such as the Weibull density function may be employed in the AAES study if
they are deemed appropriate. The Weibull density function is defined by
the equation:

$f_2(x; \alpha, \beta, \gamma)$

In equation (7.5), note that $f(x)$ has a singularity at the origin
whenever $0 < \alpha < 1$. Further, if $\alpha = (n-1)/2$ and $\beta = 2$, the Chi-square
probability density function, to be discussed presently in Section 8, emerges. The
normal probability density function can be derived by letting $x = z^2$, $\alpha = 3/2$,
and $\beta = 1$. Then from equation (7.4),

$$f_3(z) = \frac{1}{\sqrt{\pi}}\, e^{-z^2} \qquad (7.8)$$


From equation (7.5) it can be shown that the cumulative distribution
function is:

$$F(x) = \int_{y=0}^{x} f(y)\,dy = \frac{1}{\Gamma(\alpha)\,\beta^{\alpha}} \int_{0}^{x} y^{\alpha-1} e^{-y/\beta}\,dy \qquad (7.9)$$

Similarly,

$$P(a < x < b) = \int_{a}^{b} f(x)\,dx = 1 - \alpha \qquad (7.10)$$

Finally,

$$\int_{0}^{a} f(x)\,dx = \int_{b}^{\infty} f(x)\,dx = \alpha/2, \qquad (7.11)$$

where $f(x)$ is the gamma probability density function defined by equation (7.5).
Equation (7.11) is used whenever a hypothesis is being tested using a two-tailed
test. It is noted in equation (7.11) that $\alpha$ represents level-of-significance,
not one of the parameters in the gamma probability density
function.
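The two-tailed limits described by equations (7.10) and (7.11) can be located numerically. The following is a minimal sketch under stated assumptions: the function names are illustrative, the shape parameter is restricted to $\alpha \ge 1$ so that the density is finite at the origin, and Simpson's rule with bisection is one convenient choice rather than the report's own procedure. The argument `level` plays the role of the level-of-significance.

```python
from math import exp, gamma

def gamma_pdf(x, alpha, beta):
    """Gamma density of equation (7.5); this sketch assumes alpha >= 1."""
    return x ** (alpha - 1.0) * exp(-x / beta) / (gamma(alpha) * beta ** alpha)

def gamma_cdf(x, alpha, beta, steps=2000):
    """F(x) of equation (7.9) by composite Simpson's rule (steps must be even)."""
    if x <= 0.0:
        return 0.0
    h = x / steps
    total = gamma_pdf(0.0, alpha, beta) + gamma_pdf(x, alpha, beta)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * gamma_pdf(k * h, alpha, beta)
    return total * h / 3.0

def gamma_quantile(p, alpha, beta, tol=1e-9):
    """Solve F(x) = p by bisection after bracketing the root."""
    lo, hi = 0.0, beta
    while gamma_cdf(hi, alpha, beta) < p:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gamma_cdf(mid, alpha, beta) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def two_tail_limits(level, alpha, beta):
    """Limits a, b with probability level/2 in each tail, as in equation (7.11)."""
    return (gamma_quantile(level / 2.0, alpha, beta),
            gamma_quantile(1.0 - level / 2.0, alpha, beta))

a, b = two_tail_limits(0.05, alpha=2.0, beta=1.0)
print(a, b, gamma_cdf(b, 2.0, 1.0) - gamma_cdf(a, 2.0, 1.0))
```

With $\alpha = 1$ the routine reproduces the exponential case of equation (7.6), which gives a simple closed-form check.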


7.2.1 Example of Goodness-of-Fit of Data to the Gamma Density Function

Goodness-of-fit can be measured with the Chi-square statistic,
which will be discussed in more detail in Section 8. As an example of application
of goodness-of-fit, consider the data in Table 4-3. Note that there
were a total of 107 ejection related injuries. Normalize these injuries by
dividing each by 107, form a frequency histogram, and graph the results as
shown in Figure 7-3. The normalized data are ranked according to increasing
severity of injury. Now assume validity of the following null hypothesis:

$H_0$: The sample of ejection injuries shown in Figure 7-3 is
a random sample extracted from a continuous parent density
function that is gamma distributed.

Test of this hypothesis can be performed with the Chi-square
statistic defined by the equation:

$$\chi^2_{d.f.=k-1} = \sum_{i=1}^{k} \frac{(o_i - e_i)^2}{e_i} \qquad (7.12)$$


where

$o_i$ = observed frequency,
$e_i$ = expected (theoretical) frequency,
$k$ = number of frequency cells into which the data have been grouped,
d.f. = degrees of freedom,
$\chi^2$ = Chi-square statistic.

[Frequency histogram of normalized injuries by category: 1. No/Minimal, 2. Minor, 3. Major, 4. Fatal, 5. Lost/Unknown]

FIGURE 7-3. EXAMPLE OF GOODNESS-OF-FIT


In equation (7.12), the observed frequencies are noted as follows:
$o_1 = 0.1869$, $o_2 = 0.3552$, $o_3 = 0.2523$, $o_4 = 0.1402$, and $o_5 = 0.0654$.
Expected frequency is determined as follows:

$$e_1 = \int_{1/2}^{3/2} f(x)\,dx, \qquad (7.13)$$

where $f(x)$ is defined by equation (7.5). Similar calculations will yield
values for $e_2$, $e_3$, $e_4$, and $e_5$.

The frequency histogram in Figure 7-3 is interpreted as follows:
The first frequency cell represents the fraction of ejectees who were uninjured,
namely 0.1869. Since total ejectees is 107, this first cell represents
20 uninjured/minimally injured ejectees. The second cell represents
38 ejectees with minor injuries. The third cell represents 27 ejectees with
non-fatal major injuries. The fourth cell represents 15 ejectees who experienced
fatal injuries. The fifth cell represents 7 ejectees who cannot be
accounted for and are presumed fatalities. A cumulative distribution function
could be drawn for this example.

It must be emphasized that the above is actual A-6 ejection
related injury data developed from the MORs.
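The normalization step and the statistic of equation (7.12) can be sketched as follows. The cell counts are those quoted in the text; the expected frequencies in `expected_example` are hypothetical placeholders chosen only to exercise the function, since the report does not list the fitted values here. Following the text, the statistic is formed from normalized frequencies.

```python
# Cell counts quoted in the interpretation of Figure 7-3.
counts = {"No/Minimal": 20, "Minor": 38, "Major": 27, "Fatal": 15, "Lost/Unknown": 7}
total = sum(counts.values())
observed = [c / total for c in counts.values()]

def chi_square(observed, expected):
    """Chi-square statistic of equation (7.12); returns (statistic, degrees of freedom)."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of cells")
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1

# Hypothetical expected frequencies, for illustration only (not from the report).
expected_example = [0.20, 0.35, 0.25, 0.13, 0.07]
stat, dof = chi_square(observed, expected_example)
print([round(o, 4) for o in observed], round(stat, 5), dof)
```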

7.2.2 Numerical Integration of the Gamma Density Function

To obtain theoretical frequency as defined by equation (7.13),
it is necessary to integrate the gamma probability density function over
the interval $0.5 < x < 1.5$. Combine equations (7.3) and (7.5) and write:

$$P(a < x < b) = \frac{1}{\Gamma(\alpha)\,\beta^{\alpha}} \int_{x=a}^{b} x^{\alpha-1} e^{-x/\beta}\,dx. \qquad (7.14)$$

There are two cases to study: (1) $\alpha$ is an integer, and (2) $\alpha$ is not an integer.

Case 1. $\alpha$ is an integer

For this case, equation (7.14) can be written

$$F(b) - F(a) = \frac{1}{\Gamma(\alpha)\,\beta^{\alpha}} \left[ \beta a^{\alpha-1} e^{-a/\beta} - \beta b^{\alpha-1} e^{-b/\beta} + \beta(\alpha - 1) \int_{x=a}^{b} x^{\alpha-2} e^{-x/\beta}\,dx \right] \qquad (7.15)$$

Equation (7.15) follows from one integration by parts. Repeating the step $k$ times
lowers the power of $x$ in the integrand to $x^{\alpha-k}$, and eventually $x^{\alpha-k} = 1$ whenever
$k = \alpha$, so that a terminating expression exists for $P(a < x < b)$. Generally,
the problem of primary interest is that of finding $b$ such that a prescribed
level-of-significance is satisfied.
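For integer $\alpha$, carrying the integration by parts of equation (7.15) through to the end gives the finite sum $F(x) = 1 - e^{-x/\beta} \sum_{k=0}^{\alpha-1} (x/\beta)^k / k!$. This closed form is a standard result stated here as an assumption, since the report does not write it out; the function names below are illustrative.

```python
from math import exp, factorial

def gamma_cdf_integer_alpha(x, alpha, beta):
    """Terminating expression for F(x) when alpha is a positive integer."""
    if not (isinstance(alpha, int) and alpha >= 1):
        raise ValueError("alpha must be a positive integer")
    if x <= 0:
        return 0.0
    r = x / beta
    return 1.0 - exp(-r) * sum(r ** k / factorial(k) for k in range(alpha))

def gamma_interval_integer_alpha(a, b, alpha, beta):
    """P(a < x < b) = F(b) - F(a), the quantity on the left of equation (7.15)."""
    return gamma_cdf_integer_alpha(b, alpha, beta) - gamma_cdf_integer_alpha(a, alpha, beta)

# Probability over the first frequency cell, 0.5 < x < 1.5, for one trial choice
# of parameters (alpha = 2, beta = 1 are illustrative, not fitted values).
print(gamma_interval_integer_alpha(0.5, 1.5, alpha=2, beta=1.0))
```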


Case 2. $\alpha$ is not an integer

Perhaps the easiest way to compute theoretical frequency, whenever
$\alpha$ is not an integer, is to expand $e^{-x/\beta}$ in a power series, multiply by
$x^{\alpha-1}$, integrate term by term, insert limits and simplify to get:

$$P(a < x < b) = \frac{1}{\Gamma(\alpha)} \sum_{n=0}^{\infty} \frac{(-1)^n \left(b^{\alpha+n} - a^{\alpha+n}\right)}{(\alpha + n)\,\beta^{\alpha+n}\,n!} \qquad (7.16)$$

The right side of equation (7.16) can be shown to converge by the ratio test.


To consider a specific numerical example, let $a = 0$, $b = 46$,
$\alpha = 5.236$, and $\beta = 2$. Then for $n = 70$, the numerical value for the 71st
term is:

$$u_{70} = \frac{23^{75.236}}{(4.236)(3.236)(2.236)(1.236)\,\Gamma(1.236)\,(75.236)\,(70!)}$$

The $\Gamma(1.236)$ can be looked up in tables such as [6], or computed from equation
(7.4). Likewise, for large $n > 20$, $n!$ can be computed from Stirling's
approximation

$$n! \approx \sqrt{2\pi n}\,(n/e)^n \qquad (7.17)$$

Thus for $n = 70$, $\Gamma(1.236) = 0.90964$, $70! = 1.196432 \times 10^{100}$, hence

$$u_{70} = 0.0910577.$$

Likewise,

$$u_{69} = -0.28053164.$$

In a similar fashion, $u_0, u_1, \ldots, u_{68}$, etc. could be computed and an
approximation written:

$$P(0 < x < 46) \approx \frac{1}{\Gamma(\alpha)} \sum_{n=0}^{70} u_n, \qquad (7.18)$$

where $u_n$ is the summand in equation (7.16), thus,

$$u_n = \frac{(-1)^n \left(b^{\alpha+n} - a^{\alpha+n}\right)}{(\alpha + n)\,\beta^{\alpha+n}\,n!} \qquad (7.19)$$


The above numerical process can be applied to the earlier example to get expected
(theoretical) frequencies. Knowing the $o_i$'s from observation and the $e_i$'s
from numerical integration enables one to compute the Chi-square statistic,
defined by equation (7.12), to confirm, or not, the null hypothesis at a given
level-of-significance. The computation shown above merely illustrates a
method for numerically calculating theoretical frequency. No goodness-of-fit
calculations are involved.
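The series of equation (7.16) alternates in sign and its early terms are very large for the example values, so a sketch of the summation benefits from extended-precision arithmetic. The code below is an illustration under stated assumptions: it uses Python's `decimal` module, the name `series_terms` is not from the report, and each returned term already includes the factor $1/\Gamma(\alpha)$, which matches the way the numerical value of $u_{70}$ is formed in the text. Because the terms are still of order 0.1 near $n = 70$, the sketch also sums further terms so the 71-term partial sum can be compared with a longer one.

```python
from decimal import Decimal, getcontext
from math import gamma

getcontext().prec = 60  # extra digits absorb cancellation between large alternating terms

def series_terms(a, b, alpha, beta, n_terms):
    """Terms of equation (7.16), each divided by Gamma(alpha), for n = 0 .. n_terms - 1."""
    alpha_d = Decimal(str(alpha))
    ra = Decimal(str(a)) / Decimal(str(beta))
    rb = Decimal(str(b)) / Decimal(str(beta))
    pa = ra ** alpha_d  # holds (a/beta)^(alpha+n) / n!, starting at n = 0
    pb = rb ** alpha_d  # holds (b/beta)^(alpha+n) / n!, starting at n = 0
    g = Decimal(gamma(alpha))
    terms = []
    for n in range(n_terms):
        sign = -1 if n % 2 else 1
        terms.append(sign * (pb - pa) / ((alpha_d + n) * g))
        pa = pa * ra / (n + 1)
        pb = pb * rb / (n + 1)
    return terms

terms = series_terms(0, 46, 5.236, 2, 151)
print("u_69 =", float(terms[69]))
print("u_70 =", float(terms[70]))
print("sum through n = 70 :", float(sum(terms[:71])))
print("sum through n = 150:", float(sum(terms)))
```

Updating each power-over-factorial by a simple recurrence avoids forming quantities such as $23^{75.236}$ and $70!$ separately.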
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7.3  The  Beta  Probability  Density  Function 


The beta probability density function has found important applications
in Bayesian statistics in recent years where probabilities are sometimes
viewed as random variables, and there exists a need for a flexible probability
density function which assumes non-zero values only on the interval
$0 < x < 1$. Here, flexibility implies that the beta density function can assume
a wide variety of shapes. As appropriate, the beta probability density
function will be used with Bayesian statistics, discussed in Section 4, to
refine prior probability estimates on the occurrence
of various aircraft operational events related to this AAES study.

The uniform distribution

$$f(x) = \begin{cases} 1; & 0 < x < 1 \\ 0; & \text{elsewhere} \end{cases} \qquad (7.20)$$

is a special case of the beta probability density function. The beta parent
probability density function is defined as follows:

$$f(x) = \begin{cases} \dfrac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\, x^{\alpha-1} (1 - x)^{\beta-1}; & 0 < x < 1 \\ 0; & \text{elsewhere} \end{cases} \qquad (7.21)$$

In equation (7.21), $\alpha > 0$ and $\beta > 0$. Note that the uniform density function
can be obtained from equation (7.21) by letting $\alpha = \beta = 1$.

The incomplete beta function is defined thus:

$$B_y(\alpha, \beta) = \int_{0}^{y} f(x)\,dx; \quad 0 < x < y < 1, \qquad (7.22)$$

where $f(x)$ is the beta probability density function defined by equation (7.21).

Levels-of-significance and confidence intervals for the beta density
function can be defined in a way similar to definitions for these concepts
already displayed for other probability density functions. Thus,

$$\int_{0}^{a} f(x)\,dx = \int_{b}^{1} f(x)\,dx = \alpha/2,$$

where $0 < a < b < 1$, and

$$P(a < x < b) = \int_{a}^{b} f(x)\,dx = 1 - \alpha \qquad (7.23)$$

It should be emphasized that the $\alpha$ in equation (7.23) represents
level-of-significance rather than a parameter in either the gamma or beta
probability density functions.
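A short numerical sketch of equations (7.21) and (7.22) follows. The function names are illustrative, and the midpoint rule is chosen here only because it never evaluates the density at the end points 0 and 1; it is one of several reasonable integration schemes.

```python
from math import gamma

def beta_pdf(x, alpha, beta):
    """Beta density of equation (7.21); zero outside 0 < x < 1."""
    if not 0.0 < x < 1.0:
        return 0.0
    coeff = gamma(alpha + beta) / (gamma(alpha) * gamma(beta))
    return coeff * x ** (alpha - 1.0) * (1.0 - x) ** (beta - 1.0)

def incomplete_beta(y, alpha, beta, steps=2000):
    """Integral of the beta density from 0 to y, equation (7.22), by the midpoint rule."""
    h = y / steps
    return h * sum(beta_pdf((k + 0.5) * h, alpha, beta) for k in range(steps))

# With alpha = beta = 1 the density is uniform, so the integral up to y equals y.
print(incomplete_beta(0.3, 1.0, 1.0))
# A skewed case: alpha = 2, beta = 3.
print(incomplete_beta(0.5, 2.0, 3.0))
```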
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APPENDIX  A 
Section  8.0 

SMALL  SAMPLE  STATISTICAL  ANALYSIS:  SAMPLING 

PROBABILITY  DENSITY  FUNCTIONS 


8.0 Small Sample Statistical Analysis: Sampling Probability Density Functions


It is true in many statistical analyses that relatively large samples,
that is, samples having many elements, are studied. Under those conditions,
various valid simplifying approximations can be made. Thus, discrete density
functions such as binomial and Poisson approach the normal density
function; the Stirling approximation to $n!$ can be used with good accuracy;
and the mean and variance for a run test can be computed under the assumption
of the approximation of the discrete parent density function to the
normal density function.

Frequently, however, data such as AAES ejection data are obtained only
in small sample sizes. Accordingly, small sample statistical theory must be
brought to bear on the statistical problem under consideration. This section
will present very briefly some pertinent concepts from the field of exact
sampling and demonstrate how these can be used in the statistical analysis of
AAES data. Numerical examples are given to illustrate various tests performed.

8.1  Joint  Frequency  Function  for  Sample  Mean  and  Sample  Variance 

The joint frequency function for a sample of N independent normal
random variates $x_1, x_2, \ldots, x_N$ is

$$f(x_1, x_2, \ldots, x_N) = (2\pi\sigma^2)^{-N/2}\, e^{-v^2/2\sigma^2} \qquad (8.1)$$

where

$$v^2 = \sum_{i=1}^{N} x_i^2 = \sum_{i=1}^{N} (x_i - \bar{x} + \bar{x})^2 \qquad (8.2)$$

$$v^2 = N s^2 + N \bar{x}^2,$$

and

$$\sum_{i=1}^{N} (x_i - \bar{x}) = 0.$$


Equation (8.1) shows that $f(x_1, x_2, \ldots, x_N)$ is a function of the sample
mean $\bar{x}$ and sample variance $s^2$. To find the distribution of $\bar{x}$ and $s$
separately, change from the set $x_1, x_2, \ldots, x_N$ to the set $\bar{x}, s, w_1, w_2,
\ldots, w_{N-2}$, where the $w_i$'s are chosen in a way such that the following
two conditions hold:

$$\sum_i (x_i - \bar{x}) = 0 \qquad (8.3)$$

$$\sum_i (x_i - \bar{x})^2 = N s^2 \qquad (8.4)$$

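The transformation from one set of variables to another is performed by means
of Jacobians. Thus,

$$dx_1\, dx_2 \cdots dx_N = |J|\, d\bar{x}\, ds\, dw_1 \cdots dw_{N-2},$$

where

$$J = J\left(\frac{x_1, x_2, \ldots, x_N}{\bar{x}, s, w_1, \ldots, w_{N-2}}\right),$$

or in expanded form the Jacobian J, as a determinant of the transformation
from one space to another, can be written as follows:

$$J = \begin{vmatrix}
\partial x_1/\partial \bar{x} & \partial x_1/\partial s & \partial x_1/\partial w_1 & \cdots & \partial x_1/\partial w_{N-2} \\
\partial x_2/\partial \bar{x} & \partial x_2/\partial s & \partial x_2/\partial w_1 & \cdots & \partial x_2/\partial w_{N-2} \\
\vdots & \vdots & \vdots & & \vdots \\
\partial x_N/\partial \bar{x} & \partial x_N/\partial s & \partial x_N/\partial w_1 & \cdots & \partial x_N/\partial w_{N-2}
\end{vmatrix} \qquad (8.5)$$

The frequency function in the new variables is

$$f_1(\bar{x}, s, w_1, \ldots, w_{N-2}) = |J|\, f(\bar{x}, s), \qquad (8.6)$$

where $f(\bar{x}, s)$ is defined by the right side of equation (8.1), and is a function
of $\bar{x}$ and $s$ only. By integrating equation (8.6) over all $w_i$'s, the joint
frequency function of $\bar{x}$ and $s$ is obtained.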


The relations between the two sets of variables may be taken as
follows:

$$\begin{aligned}
x_1 &= \bar{x} + s N^{1/2}\, w_0 w_1 w_2 \cdots w_{N-3}\, w_{N-2} \\
x_2 &= \bar{x} + s N^{1/2}\, w_0 w_1 \cdots w_{N-3}\, (1 - w_{N-2}^2)^{1/2} \\
x_3 &= \bar{x} + s N^{1/2}\, w_0 w_1 \cdots w_{N-4}\, (1 - w_{N-3}^2)^{1/2} \\
&\ \ \vdots \\
x_{N-1} &= \bar{x} + s N^{1/2}\, w_0\, (1 - w_1^2)^{1/2} \\
x_N &= \bar{x} + s N^{1/2}\, (1 - w_0^2)^{1/2}
\end{aligned} \qquad (8.7)$$

It is easy to see from equations (8.7) that $\sum_i (x_i - \bar{x})^2 = N s^2$. To show that
$\sum_i (x_i - \bar{x}) = 0$, the variable $w_0$, introduced in equations (8.7), must be expressible in
terms of $w_1, w_2, \ldots, w_{N-2}$. Clearly, this is the case, hence both conditions
defined by equations (8.3) and (8.4) are satisfied.
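The sum-of-squares condition (8.4) can be checked numerically for the transformation. The sketch below assumes the nested-product form of equations (8.7), in which each successive $x_i$ drops one trailing $w$ factor and closes with a factor $(1 - w_k^2)^{1/2}$; the function name `deviations` is illustrative. It checks only condition (8.4); condition (8.3) additionally constrains $w_0$, which is not enforced here.

```python
from math import prod, sqrt

def deviations(s, w0, w):
    """Return x_i - xbar for i = 1 .. N under the nested-product form assumed for (8.7).

    w holds w_1 .. w_{N-2}; every value, including w0, must lie in [-1, 1].
    """
    factors = [w0] + list(w)            # w_0, w_1, ..., w_{N-2}
    n_obs = len(factors) + 1            # N
    scale = s * sqrt(n_obs)
    out = [scale * prod(factors)]       # x_1: product of every w
    # x_2 .. x_N: drop trailing factors one at a time, closing with (1 - w_k^2)^(1/2)
    for k in range(len(factors) - 1, -1, -1):
        out.append(scale * prod(factors[:k]) * sqrt(1.0 - factors[k] ** 2))
    return out

d = deviations(s=2.0, w0=0.4, w=[0.3, 0.5, 0.7])   # N = 5
print(sum(x * x for x in d), 5 * 2.0 ** 2)          # both sides of condition (8.4)
```

The squared terms telescope: adding the last two gives the square of the shorter product, and so on down to 1, which is why the total equals $N s^2$ for any admissible $w$ values.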


From equation (8.7) certain values can be inserted into the Jacobian
J defined by equation (8.5). Thus,

$$J = \begin{vmatrix}
1 & N^{1/2}\,\Pi_{1,2} & sN^{1/2} w_0\,\Pi_{1,3} & \cdots & sN^{1/2} w_0\,\Pi_{1,N} \\
1 & N^{1/2}\,\Pi_{2,2} & sN^{1/2} w_0\,\Pi_{2,3} & \cdots & sN^{1/2} w_0\,\Pi_{2,N} \\
1 & N^{1/2}\,\Pi_{3,2} & sN^{1/2} w_0\,\Pi_{3,3} & \cdots & sN^{1/2} w_0\,\Pi_{3,N} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & N^{1/2}\,\Pi_{N,2} & sN^{1/2} w_0\,\Pi_{N,3} & \cdots & sN^{1/2} w_0\,\Pi_{N,N}
\end{vmatrix} \qquad (8.8)$$

In equation (8.8), $\Pi_{i,j} = \Pi(w)$ only, that is, they represent products of
$w_i$'s only. The Jacobian can be written, upon factoring out all the $N^{1/2}$ and $s$:

$$J = N^{(N-1)/2}\, s^{N-2}\, |D|, \qquad (8.9)$$

where $|D|$ is a determinant containing only $w$'s.

From equations (8.1) and (8.6), the frequency function is written
thus:

$$f_1(\bar{x}, s, w_1, w_2, \ldots, w_{N-2}) = C_1\, |D|\, s^{N-2}\, e^{-N(\bar{x}^2 + s^2)/2\sigma^2} \qquad (8.10)$$

where $C_1$ is a constant depending only on $N$ and $\sigma$. Integrate both sides of
equation (8.10) over all $w_1, w_2, \ldots, w_{N-2}$ to get:

$$f_2(\bar{x}, s) = C_2\, e^{-N\bar{x}^2/2\sigma^2}\, s^{N-2}\, e^{-Ns^2/2\sigma^2}; \quad -\infty < \bar{x} < +\infty,\ 0 \le s^2 < +\infty \qquad (8.11)$$

The actual integration of equation (8.10) needs only to be indicated. The
integration of $|D|$ is a constant and is absorbed into the constant $C_2$. A
numerical value for $C_2$ is:

$$C_2 = \frac{2\,(N/2\sigma^2)^{N/2}}{\sqrt{\pi}\;\Gamma\!\left(\frac{N-1}{2}\right)}$$

8.2 The Probability Density Function for the Sample Mean

Clearly the joint density function defined by equation (8.11) is
comprised of two independent density functions: one for $\bar{x}$, the other for $s$.
Each of these is written as follows:

$$f_3(\bar{x}) = C_3\, e^{-N\bar{x}^2/2\sigma^2} \qquad (8.12)$$

and

$$f_4(s) = C_4\, s^{N-2}\, e^{-Ns^2/2\sigma^2} \qquad (8.13)$$

Since

$$\int_{-\infty}^{\infty} f_3(\bar{x})\,d\bar{x} = 1.0,$$

it follows from performing the integration that

$$C_3 = (N/2\pi\sigma^2)^{1/2} \qquad (8.14)$$

Hence,

$$f_3(\bar{x}) = (N/2\pi\sigma^2)^{1/2}\, e^{-N\bar{x}^2/2\sigma^2} \qquad (8.15)$$

8.3 The Probability Density Function for Sample Variance

Clearly,

$$f_4(s)\,ds = f_5(s^2)\,d(s^2), \quad \text{or} \quad f_4(s) = f_5(s^2)\,(2s), \qquad (8.16)$$

hence write equation (8.13) in terms of $s^2$ instead of $s$ as follows:

$$f_5(s^2) = C_5\, \frac{(s^2)^{(N-2)/2}}{s}\, e^{-Ns^2/2\sigma^2}$$

$$f_5(s^2) = C_5\, (s^2)^{(N-3)/2}\, e^{-Ns^2/2\sigma^2} \qquad (8.17)$$

Again, note

$$\int_{0}^{\infty} f_5(s^2)\,d(s^2) = 1.0$$

From this fact, it is easy to show that

$$C_5 = \frac{(N/2\sigma^2)^{(N-1)/2}}{\Gamma\!\left(\frac{N-1}{2}\right)} \qquad (8.18)$$

and finally, equation (8.17) becomes:

$$f_5(s^2) = \frac{(N/2\sigma^2)^{(N-1)/2}}{\Gamma\!\left(\frac{N-1}{2}\right)}\, (s^2)^{(N-3)/2}\, e^{-Ns^2/2\sigma^2} \qquad (8.19)$$


Thus, 


and  equation  (8.12)  becomes 


(8.21) 
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is  an  odd  function  of  x,  that  is,  g(-x)  ■ - g(x).  Since  the  integral  of  an 
odd  function  over  symmetric  limits  is  zero,  it  follows  that 


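$$E(\bar{x}) = 0 \qquad (8.23)$$

To compute the variance of the sample mean, write

$$Var(\bar{x}) = \nu_2 - \nu_1^2, \qquad (8.24)$$

where $\nu_2$ is the second moment. From equation (8.23), $\nu_1 = 0$, hence $Var(\bar{x}) = \nu_2$:

$$Var(\bar{x}) = \left(\frac{N}{2\pi\sigma^2}\right)^{1/2} \int_{-\infty}^{\infty} \bar{x}^2\, e^{-N\bar{x}^2/2\sigma^2}\,d\bar{x}$$

Let $z = \bar{x}^2$; $\bar{x} = z^{1/2}$; $dz = 2\bar{x}\,d\bar{x}$.

Then,

$$Var(\bar{x}) = 2\left(\frac{N}{2\pi\sigma^2}\right)^{1/2} \int_{0}^{\infty} z\, e^{-Nz/2\sigma^2}\, \frac{dz}{2 z^{1/2}}$$

$$Var(\bar{x}) = \left(\frac{N}{2\pi\sigma^2}\right)^{1/2} \int_{0}^{\infty} z^{1/2}\, e^{-Nz/2\sigma^2}\,dz$$

Since $\Gamma(3/2) = (1/2)\,\Gamma(1/2) = \sqrt{\pi}/2$, this last expression reduces to

$$Var(\bar{x}) = \sigma^2/N \qquad (8.25)$$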
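Equations (8.23) and (8.25) can be illustrated by simulation. The following is a minimal sketch with illustrative names and arbitrary settings (sample size 10, $\sigma = 2$, 20000 trials, fixed seed); it draws from a zero-mean normal parent, as the derivation above assumes.

```python
import random

def simulate_sample_means(n_obs, sigma, trials, seed=12345):
    """Draw `trials` samples of size n_obs from N(0, sigma^2) and return their means."""
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n_obs)]
        means.append(sum(sample) / n_obs)
    return means

means = simulate_sample_means(n_obs=10, sigma=2.0, trials=20000)
grand_mean = sum(means) / len(means)
var_of_mean = sum((m - grand_mean) ** 2 for m in means) / len(means)

# Compare the simulated values with E(xbar) = 0 and Var(xbar) = sigma^2 / N.
print(grand_mean, var_of_mean, 2.0 ** 2 / 10)
```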
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Substitute  equation  (8.19)  into  (8.26)  to  get: 

N-l 


From  equations  (8.16)  and  (8.17), 


f4(s)  = 
E(s)  = 

E(s)  = 


E(s)  = 


2 C, 


N-2  -Ns2/2o2 

s e 


(8 


(8 
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The  variance  of  s can  be  found  by  use  of  the  moment  generating 
function.  Thus, 

00 

Sr  f^(s)  ds  (8 


Substitute  equation  (8.29)  into  (8.31)  and  get: 

N-l 


2 *'0 


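This last equation reduces to

$$Var(s) = \frac{1}{N}\left[N - 1 - \frac{2\,\Gamma^2(N/2)}{\Gamma^2\!\left(\frac{N-1}{2}\right)}\right]\sigma^2, \qquad (8.33)$$

where $\Gamma^2(t) = [\Gamma(t)]^2$.

An approximate value for $Var(s)$ can be written:

$$Var(s) \approx \sigma^2\left[\frac{1}{2N} - \frac{1}{8N^2} - \frac{3}{16N^3} - \cdots\right] \qquad (8.34)$$

so that approximately,

$$\sigma_s^2 = Var(s) \approx \sigma^2/2N \qquad (8.35)$$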
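The quality of the approximation $Var(s) \approx \sigma^2/2N$ can be examined numerically. The sketch below assumes the exact expression takes the standard form for a standard deviation computed with divisor $N$, namely $Var(s) = (\sigma^2/N)\,[\,N - 1 - 2\,\Gamma^2(N/2)/\Gamma^2((N-1)/2)\,]$; the function names are illustrative. Logarithms of the gamma function are used so that large $N$ does not overflow.

```python
from math import exp, lgamma

def var_s_exact(n_obs, sigma=1.0):
    """Exact Var(s) for a normal sample, with s computed using divisor N (assumed form)."""
    ratio_sq = exp(2.0 * (lgamma(n_obs / 2.0) - lgamma((n_obs - 1) / 2.0)))
    return (n_obs - 1 - 2.0 * ratio_sq) * sigma ** 2 / n_obs

def var_s_approx(n_obs, sigma=1.0):
    """Leading-order approximation sigma^2 / 2N."""
    return sigma ** 2 / (2.0 * n_obs)

for n_obs in (5, 10, 20, 50):
    print(n_obs, var_s_exact(n_obs), var_s_approx(n_obs))
```

For very small samples the two values differ noticeably, which is consistent with the emphasis this section places on exact small-sample results.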


8.7 The Chi-Square Probability Density Function


The Chi-square probability density function can be obtained from
equation (8.19) by choosing $\chi^2 = Ns^2/\sigma^2$. Making this substitution yields
the following:

$$f_6(\chi^2) = \frac{(\chi^2)^{(N-3)/2}\, e^{-\chi^2/2}}{2^{(N-1)/2}\;\Gamma\!\left(\frac{N-1}{2}\right)} \qquad (8.36)$$

where, to derive equation (8.36), use is made of the equation

$$f_5(s^2)\,d(s^2) = f_6(\chi^2)\,d\chi^2 \qquad (8.37)$$

from which it immediately follows that

$$f_5(s^2) = f_6(\chi^2)\,(N/\sigma^2) \qquad (8.38)$$

Equation (8.36) is one form of the Chi-square probability density function.


8.8 The t-Probability Density Function

The t-density function, here presented without derivation, is

$$f_n(t) = K_n \left(1 + \frac{t^2}{n}\right)^{-(n+1)/2}; \quad n = N - 1 \qquad (8.39)$$

where

$$1/K_n = n^{1/2}\, B(n/2, 1/2). \qquad (8.40)$$

B represents the beta function which can be expressed as follows:

$$B(m, n) = \frac{\Gamma(m)\,\Gamma(n)}{\Gamma(m + n)} \qquad (8.41)$$

in which $\Gamma(m)$ is the gamma function of $m$. Combining equations (8.40) and
(8.41), it is easy to show that

$$K_n = \frac{\Gamma\!\left(\frac{n+1}{2}\right)}{\sqrt{n\pi}\;\Gamma(n/2)} \qquad (8.42)$$

or in terms of $N$, where $n = N - 1$,

$$K_n = \frac{\Gamma(N/2)}{\sqrt{(N-1)\pi}\;\Gamma\!\left(\frac{N-1}{2}\right)} \qquad (8.43)$$

Substitute (8.42) into (8.39) to get:

$$f_n(t) = \frac{\Gamma\!\left(\frac{n+1}{2}\right)}{\sqrt{n\pi}\;\Gamma(n/2)} \left(1 + \frac{t^2}{n}\right)^{-(n+1)/2} \qquad (8.44)$$

The t-statistic is defined by the expression:

$$t = \frac{(\bar{x} - \mu)\,(N - 1)^{1/2}}{s} \qquad (8.45)$$

which shows that $f_n(t)$ is a function only of the population mean, $\mu$. This
function, $f_n(t)$, is useful in testing the null hypothesis that a given sample
is extracted from a universe with a given mean. Note that $f_n(t)$ is independent
of the population variance, $\sigma^2$.

It can be shown that for $n$ sufficiently large, the t-density function
approaches the normal density function. Recall that for large $n$,
Stirling's approximation gives

$$n! \approx \sqrt{2\pi n}\,(n/e)^n$$


Also, 


From  this  information,  K is  written  thus: 

n 


(8.46) 


which  immediately  simplifies  to  the  following 


Write  the  variable  portion  of  f (t)  as  follows 


Clearly 


Thus 


as  was  to  be  shown 
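The limit can be sketched with the standard large-$n$ argument. The steps below are a reconstruction under that assumption, using $K_n$ in the form $\Gamma\!\left(\frac{n+1}{2}\right) / \left(\sqrt{n\pi}\,\Gamma(n/2)\right)$, and are not necessarily the intermediate equations of the report:

```latex
\begin{aligned}
K_n &= \frac{\Gamma\!\left(\frac{n+1}{2}\right)}{\sqrt{n\pi}\;\Gamma\!\left(\frac{n}{2}\right)}
      \;\longrightarrow\; \frac{1}{\sqrt{2\pi}},
      \quad\text{since}\quad
      \frac{\Gamma\!\left(\frac{n+1}{2}\right)}{\Gamma\!\left(\frac{n}{2}\right)}
      \approx \sqrt{\frac{n}{2}} \ \text{ for large } n, \\[4pt]
\left(1 + \frac{t^2}{n}\right)^{-(n+1)/2} &\;\longrightarrow\; e^{-t^2/2}
      \quad\text{as } n \to \infty, \\[4pt]
f_n(t) &\;\longrightarrow\; \frac{1}{\sqrt{2\pi}}\; e^{-t^2/2}.
\end{aligned}
```

The ratio of gamma functions in the first line is where Stirling's approximation enters; the second line is the usual exponential limit.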


The t-distribution has been found useful in such problems as testing
the significance of the difference between two means and testing hypotheses
regarding regression coefficients. To illustrate the former concept, let
$\bar{x}_1$ and $\bar{x}_2$ be the means and $s_1$ and $s_2$ the standard deviations of two
independent samples of size $N_1$ and $N_2$ variates, respectively. It is assumed that
the samples are extracted from a universe which is normally distributed with
mean $\mu$ and variance $\sigma^2$. The variance of the difference between the two sample
means can be shown to be $\sigma^2 (N_1 + N_2)/N_1 N_2$ [7, pg. 79]. Hence the t-variable
can be expressed as follows:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sigma} \left(\frac{N_1 N_2}{N_1 + N_2}\right)^{1/2} \qquad (8.48)$$

In practice, an estimator is used in place of $\sigma$. The one used is

$$\sigma^2 \approx \hat{\sigma}^2 = \frac{N_1 s_1^2 + N_2 s_2^2}{N_1 + N_2 - 2} \qquad (8.49)$$

Substitute equation (8.49) into (8.48) to get:

$$t_0 = \frac{\bar{x}_1 - \bar{x}_2}{\left(\dfrac{s_1^2}{N_2} + \dfrac{s_2^2}{N_1}\right)^{1/2}} \left(\frac{N_1 + N_2 - 2}{N_1 + N_2}\right)^{1/2}; \quad n = N_1 + N_2 - 2 \qquad (8.50)$$


Several special cases are an immediate consequence of equation (8.50). Thus,
if $N_1$ and $N_2$ are sufficiently large, equation (8.50) becomes:

$n = N_1 + N_2 - 2$, (8.51)


where  n can  be  viewed  as  the  number  of  degrees  of  freedom  in  the  t-statistic. 


If $N_2 \gg N_1$, then equation (8.50) reduces to the following whenever
$\bar{x}_2 \to \mu$ and $s_2 \to \sigma$:


(8.53) 


Now consider an example of the use of the t-statistic. The data
used in this example are actual ejection related injury data extracted from
the MORs.

The following table represents random samples of ejection related
injury data upon ejection from: (1) A-4 aircraft with ESCAPAC ejection systems,
and (2) A-6 aircraft with Martin-Baker ejection systems. For these two
random samples, is there a significant difference in ejection related injury
history using the difference between their means as a judgment criterion?

Injury categories in Table 8-1 are the same as those defined in
Table 4-3. From the data in Table 8-1, the following numbers are developed:


TABLE 8-1

Sample 1: A-4 Injury Data          Sample 2: A-6 Injury Data

Σ = 51.0; n1 = 18                  Σ = 48.0; n2 = 16


Substitute these numbers into equation (8.50):

$$t_0 = \frac{2.83334 - 3.0000}{\left( \dfrac{2.264706}{16} + \dfrac{2.133334}{18} \right)^{1/2}} \left( \frac{18 + 16 - 2}{18 + 16} \right)^{1/2}$$

$$t_0 = -0.31706$$

For a level of significance $\alpha = 0.05$ and $n = 32$ degrees of freedom,

$$t_{0.025;\ \mathrm{d.f.}=32} = 2.0378$$

Likewise,

$$-t_{0.025;\ \mathrm{d.f.}=32} = -2.0378$$

Since $-t_{0.025} < t_0 < t_{0.025}$, that is, since -2.0378 < -0.31706 < 2.0378, the null
hypothesis that the two samples studied are extracted from parent universes
having the same mean is accepted at the $\alpha = 0.05$ level of significance. Thus,
we reject the hypothesis that a significant difference exists between A-4 and
A-6 ejection related injury patterns, using the t-test as a judgment criterion.
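The arithmetic of this example can be checked with a few lines of Python. This is a minimal sketch, assuming the summary values quoted in the text (sums 51.0 and 48.0, divisor-N variances 2.264706 and 2.133334); the function name `pooled_t` is illustrative and not part of the report.

```python
import math

def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """t-statistic in the form of equation (8.50).

    var1 and var2 are divisor-N sample variances, as used in the text.
    """
    spread = math.sqrt(var1 / n2 + var2 / n1)
    shrink = math.sqrt((n1 + n2 - 2) / (n1 + n2))
    return (mean1 - mean2) / spread * shrink

# Summary values quoted in the text for Table 8-1.
t0 = pooled_t(51.0 / 18, 2.264706, 18, 48.0 / 16, 2.133334, 16)

# Tabulated value quoted in the text for alpha/2 = 0.025 and 32 d.f.
t_crit = 2.0378

print(f"t0 = {t0:.5f}")
print("accept H0" if -t_crit < t0 < t_crit else "reject H0")
```

The critical value is entered as a constant taken from the text rather than computed, so the sketch needs nothing beyond the standard library.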

8.9 The F-Probability Density Function

The F-probability density function is used to test the hypothesis
that two random samples of size $N_1$ and $N_2$, having variances $s_1^2$ and $s_2^2$, are
extracted from the same parent density function which has variance $\sigma^2$. A
detailed study of the derivation of the F-density function, as a special
case of the Pearson Type VI curve [7, pg. 105], is outside the scope of this
paper. An excellent description of arguments showing how the F-density
function is derived from Fisher's z-distribution can be found in [7, pp. 180-182].

Let

$$s_1^2 = \frac{1}{N_1} \sum_{i=1}^{N_1} (x_{1i} - \bar{x}_1)^2 , \qquad s_2^2 = \frac{1}{N_2} \sum_{i=1}^{N_2} (x_{2i} - \bar{x}_2)^2$$

represent variances for sample 1 and sample 2 extracted from a parent density
function having population variance $\sigma^2$. Since $n_1 = N_1 - 1$ and $n_2 = N_2 - 1$, an
unbiased estimator of the population variance from sample 1 is

$$\hat{\sigma}_1^2 = \frac{N_1 s_1^2}{n_1} ; \quad n_1 = N_1 - 1 \tag{8.56}$$

and for sample 2

$$\hat{\sigma}_2^2 = \frac{N_2 s_2^2}{n_2} ; \quad n_2 = N_2 - 1 \tag{8.57}$$


To test the hypothesis $H_0: \sigma_1^2 = \sigma_2^2 = \sigma^2$, construct the statistic:

$$F = \frac{N_1 s_1^2 / n_1}{N_2 s_2^2 / n_2} \tag{8.58}$$


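Clearly,

$$\frac{n_1 F}{n_2} = \frac{N_1 s_1^2 / \sigma^2}{N_2 s_2^2 / \sigma^2} \tag{8.59}$$

in which $N_1 s_1^2/\sigma^2$ is $\chi^2$ distributed with $(N_1 - 1)$ degrees of freedom and $N_2 s_2^2/\sigma^2$ is $\chi^2$
distributed with $(N_2 - 1)$ degrees of freedom. The variable $n_1 F/n_2$ is
distributed according to the Pearson Type VI curve mentioned earlier. The
expression is:

$$f\!\left( \frac{n_1 F}{n_2} \right) = \frac{(n_1 F/n_2)^{(n_1 - 2)/2}}{B\!\left( \frac{n_1}{2}, \frac{n_2}{2} \right) \left( 1 + \dfrac{n_1 F}{n_2} \right)^{(n_1 + n_2)/2}} \tag{8.60}$$

in which $B(n_1/2,\ n_2/2)$ is the beta function defined earlier. Equation (8.60) can be written: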


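$$f(F) = K F^{(n_1 - 2)/2} \left( 1 + \frac{n_1 F}{n_2} \right)^{-(n_1 + n_2)/2} ; \quad 0 \le F < +\infty \tag{8.61}$$

where

$$K = \frac{(n_1/n_2)^{n_1/2}}{B\!\left( \frac{n_1}{2}, \frac{n_2}{2} \right)} \tag{8.62}$$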
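The density in (8.61) can be integrated numerically to recover tail areas. The sketch below is illustrative only: it assumes that K is the usual normalising constant $(n_1/n_2)^{n_1/2} / B(n_1/2, n_2/2)$, uses composite Simpson integration, and the names `f_density` and `f_cdf` are hypothetical. It evaluates the area above 2.372, the tabulated F(17, 15) value quoted in Section 8.10.5, which should come out close to 0.05.

```python
import math

def f_density(f, n1, n2):
    """F-density of equation (8.61), assuming the usual normalising constant K.

    Returning 0 at f <= 0 is correct for n1 > 2, the only case used here.
    """
    if f <= 0:
        return 0.0
    log_beta = (math.lgamma(n1 / 2) + math.lgamma(n2 / 2)
                - math.lgamma((n1 + n2) / 2))
    log_k = (n1 / 2) * math.log(n1 / n2) - log_beta
    return math.exp(log_k
                    + (n1 - 2) / 2 * math.log(f)
                    - (n1 + n2) / 2 * math.log1p(n1 * f / n2))

def f_cdf(x, n1, n2, steps=2000):
    """Area under the density over [0, x] by composite Simpson integration."""
    h = x / steps  # steps must be even
    total = f_density(0.0, n1, n2) + f_density(x, n1, n2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_density(i * h, n1, n2)
    return total * h / 3

# Area in the right-hand tail beyond the tabulated value for (17, 15) d.f.
upper_tail = 1.0 - f_cdf(2.372, 17, 15)
print(f"P(F > 2.372) = {upper_tail:.4f}")
```

Working with logarithms of the gamma function avoids overflow for larger degrees of freedom; a library routine for the F-distribution would serve equally well where one is available.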


To illustrate use of the F-statistic, consider two samples of sizes
$n_1 + 1$ and $n_2 + 1$ from normal populations with means $\mu_1$, $\mu_2$ and variances
$\sigma_1^2$, $\sigma_2^2$, respectively. Define

$$\theta = \sigma_1^2 / \sigma_2^2 ,$$

and let the hypothesis be

$H_0$: $\theta = 1$, regardless of the actual values of $\sigma_1^2$, $\sigma_2^2$, $\mu_1$ and $\mu_2$.

Let $A_{12}$ and $B_{12}$ be two numbers, depending on $n_1$ and $n_2$, chosen so that for any
given $n_1$ and $n_2$,

$$\int_{A_{12}}^{B_{12}} f(F)\, dF = 1 - \alpha \tag{8.63}$$

where $\alpha$ is the level of significance. The geometry is shown in Figure 8-1.
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FIGURE  8-1.  GRAPH  OF  THE  F-PROBABILITY  DENSITY  FUNCTION 


Another way of writing equation (8.63) is

$$\int_0^{A_{12}} f(F)\, dF = \int_{B_{12}}^{\infty} f(F)\, dF = \alpha/2 \tag{8.64}$$


From equation (8.64), it follows that $A_{12} = 1/B_{21}$, where $B_{21}$ is the number
obtained from $B_{12}$ by interchanging $n_1$ and $n_2$. Now, define $u = 1/F$, then equation
(8.64) can be written:


so that $B_{21} = 1/A_{12}$, since the right side of (8.65) is the F-integral with
$n_1$ and $n_2$ interchanged.

Due to the relationship between $A_{12}$ and $B_{12}$, it makes no difference
which sample is labeled number 1 and which is labeled number 2. With one
arrangement, $H_0$ is rejected unless $A_{12} \le F \le B_{12}$. With the other arrangement,
$H_0$ is rejected unless $A_{21} \le 1/F \le B_{21}$. But $B_{21} = 1/A_{12}$ and $A_{21} = 1/B_{12}$. From
this, it is clear that conditions of rejection are the same.
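The change of variable behind this reciprocal relationship can be spelled out as follows. This is a sketch only; the labels $f_{12}$ and $f_{21}$, denoting the F-density with $(n_1, n_2)$ and $(n_2, n_1)$ degrees of freedom, are introduced here for illustration and do not appear in the report. Because $u = 1/F$ is itself F-distributed with the degrees of freedom interchanged,

$$\frac{\alpha}{2} = \int_0^{A_{12}} f_{12}(F)\, dF = P(F < A_{12}) = P\!\left( u > \frac{1}{A_{12}} \right) = \int_{1/A_{12}}^{\infty} f_{21}(u)\, du .$$

By definition, $B_{21}$ is the point for which $\int_{B_{21}}^{\infty} f_{21}(u)\, du = \alpha/2$, so $B_{21} = 1/A_{12}$. Interchanging the roles of the two samples gives $A_{21} = 1/B_{12}$ in the same way.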

To apply the above to a numerical example, consider the injury data,
extracted from the MORs, shown in Table 8-1. Thus, $N_1 = 18$, $N_2 = 16$;
$n_1 = N_1 - 1 = 17$, $n_2 = N_2 - 1 = 15$;
$s_1^2 = 2.264706$; $s_2^2 = 2.133334$.

The null hypothesis is

$$H_0: \sigma_1^2 / \sigma_2^2 = 1 ,$$

versus the alternate hypothesis

$$H_1: \sigma_1^2 / \sigma_2^2 > 1 .$$

Substituting the numbers developed into equation (8.58) yields:

$$F_0 = \frac{(18)(2.264706)/17}{(16)(2.133334)/15}$$

Hence,

$$F_0 = 1.053774$$

Since $F_0 < F_T(17, 15) = 2.372$, from tabulated values of the F-statistic at $\alpha = 0.05$,
the null hypothesis $H_0$ is accepted, and we conclude that the two samples under
investigation were extracted from parent density functions having the same
variance.
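The F-statistic of this example is easy to reproduce. The following is a minimal sketch, assuming the divisor-N variances and sample sizes quoted in the text; `f_statistic` is an illustrative name, and the critical value is the tabulated F(17, 15) figure quoted in Section 8.10.5 rather than a computed quantity.

```python
def f_statistic(var1, n1, var2, n2):
    """Variance ratio of equation (8.58).

    var1 and var2 are divisor-N sample variances; each is first converted
    to the unbiased estimate N * s^2 / (N - 1) of equations (8.56), (8.57).
    """
    unbiased1 = n1 * var1 / (n1 - 1)
    unbiased2 = n2 * var2 / (n2 - 1)
    return unbiased1 / unbiased2

f0 = f_statistic(2.264706, 18, 2.133334, 16)

# Tabulated F(17, 15) value at alpha = 0.05, as quoted in Section 8.10.5.
f_crit = 2.372

print(f"F0 = {f0:.6f}")
print("accept H0" if f0 < f_crit else "reject H0")
```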


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8.10  Confidence  Limits 


Upon  applying  the  statistical  techniques  outlined  above,  it  is 
highly  desirable  to  be  able  to  measure  the  confidence  invested  in  the  computed 
results.  This  measure  evolves  from  a study  of  confidence  limits.  These  will 
be  discussed  and  applied  as  appropriate. 

8.10.1  Confidence  Limits  for  the  Mean 

From equation (8.45), the t-statistic is defined as follows:

$$t = \frac{(\bar{x} - \mu) \sqrt{n}}{s}$$

where, as before, $n = N - 1$. To illustrate the discussion that follows,
consider Figure 8-2, which is a sketch of the t-density function, showing (1)
acceptance region, (2) rejection region (cross-hatched area), and (3) a
t-statistic, $t_0$, computed from some data. From Figure 8-2, it is clear that $t_0$
is in the acceptance region, since $-t_{\alpha/2} < t_0 < t_{\alpha/2}$. From the definition of
the t-statistic, it is clear that the above inequality can be written:

$$-t_{\alpha/2} < \frac{\bar{x} - \mu}{s/\sqrt{n}} < t_{\alpha/2} \tag{8.66}$$

which easily reduces to

$$\bar{x} - \frac{s}{\sqrt{n}}\, t_{\alpha/2} < \mu < \bar{x} + \frac{s}{\sqrt{n}}\, t_{\alpha/2} \tag{8.67}$$


FIGURE 8-2. GRAPH OF THE t-PROBABILITY DENSITY FUNCTION

Inequality (8.67) can be read as follows: The parent population mean, $\mu$, will
lie between the limits

$$\bar{x} - \frac{s}{\sqrt{n}}\, t_{\alpha/2} \quad \text{and} \quad \bar{x} + \frac{s}{\sqrt{n}}\, t_{\alpha/2}$$

$100 \times (1 - \alpha)$ percent of the time. Compactly written this becomes:

$$P\!\left( \bar{x} - \frac{s}{\sqrt{n}}\, t_{\alpha/2} < \mu < \bar{x} + \frac{s}{\sqrt{n}}\, t_{\alpha/2} \right) = 1 - \alpha , \tag{8.68}$$

where P( ) means the probability of ( ).

To apply this to a concrete example, consider the ejection
injury data shown in Sample 1 of Table 8-1. Two questions need to be addressed:
(1) assume the data are extracted from a normal parent density function with
mean $\mu$ and variance $\sigma^2$. What are the bounds on the parent mean, $\mu$, and
(2) what confidence can be invested in this assertion? Let $\alpha = 0.05$, and
d.f. = 17. Compute the t-statistic thus:

$$t_0 = \frac{\bar{x}_1 - \mu}{s_1/\sqrt{n}} = \frac{2.833334 - \mu}{1.504894/\sqrt{17}}$$

This simplifies to:

$$2.19825 < \mu < 3.46842 \tag{8.69}$$

The answer to the second question is easy:

$$P(2.19825 < \mu < 3.46842) = 1 - 0.05 = 0.95 \tag{8.70}$$

Thus, it can be said with 95% confidence that the parent density function from
which this particular sample was extracted has a mean $\mu$ that lies between
2.19825 and 3.46842.
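The limits above can be reproduced from inequality (8.67). The sketch below is illustrative and uses the mean and standard deviation quoted in the text; the function name `mean_limits` is hypothetical, and the multipliers are standard tabulated t values for 17 degrees of freedom entered as constants. The quoted limits are recovered with a multiplier of 1.740, which is the tabulated value leaving 0.05 in one tail; the value leaving 0.025 in each tail is 2.110 and gives somewhat wider limits, so the multiplier is worth checking against the intended confidence level.

```python
import math

def mean_limits(xbar, s, dof, t_mult):
    """Confidence limits for the parent mean in the form of inequality (8.67)."""
    half_width = t_mult * s / math.sqrt(dof)
    return xbar - half_width, xbar + half_width

# Sample 1 values quoted in the text: mean, standard deviation, n = N - 1.
xbar, s, dof = 2.833334, 1.504894, 17

# Multiplier that reproduces the limits quoted in the text.
lo, hi = mean_limits(xbar, s, dof, 1.740)

# Tabulated multiplier with 0.025 in each tail for 17 d.f.
wide_lo, wide_hi = mean_limits(xbar, s, dof, 2.110)

print(f"{lo:.5f} < mu < {hi:.5f}")
print(f"{wide_lo:.5f} < mu < {wide_hi:.5f}")
```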

8.10.2  Confidence  Limits  for  the  Difference 
Between  Two  Means 

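To get confidence limits for the difference between two
means, first consider equation (8.50) and rearrange it slightly as follows:

$$t_0 = \frac{\bar{x}_1 - \bar{x}_2}{\left( \dfrac{N_1 s_1^2 + N_2 s_2^2}{N_1 + N_2 - 2} \right)^{1/2} \left( \dfrac{1}{N_1} + \dfrac{1}{N_2} \right)^{1/2}}$$

Now make the following definition:

$$w = \bar{x}_1 - \bar{x}_2 , \qquad \omega = \mu_1 - \mu_2 ,$$

so that $w - \omega = (\bar{x}_1 - \mu_1) - (\bar{x}_2 - \mu_2)$, and

$$\hat{\sigma}_w = \left( \frac{N_1 s_1^2 + N_2 s_2^2}{N_1 + N_2 - 2} \right)^{1/2}$$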


The  new  t-statistic  becomes 


or 


(8.73) 


Equation (8.73) was derived under the following conditions: Let $\bar{x}_1$ and $s_1^2$ be
the observed mean and variance of a random sample of size $N_1$ drawn from a
normal universe with unknown mean $\mu_1$. Also, let $\bar{x}_2$ and $s_2^2$ be the observed
mean and variance of a random sample of size $N_2$ drawn from a normal universe
with unknown mean $\mu_2$. It is assumed that the two universes have the same
variance. Equation (8.73) results from the preceding sampling information.

To  evaluate  confidence  limits  for  the  difference  between 
two  means,  consider  again  the  ejection  related  injury  data  in  Table  8-1,  and 
substitute  those  numbers  into  equation  (8.73): 


From Inequality (8.74), it follows immediately that

$$P(-0.90452 < (\mu_2 - \mu_1) < 1.23785) = 0.95 \tag{8.75}$$

This says, with 95% confidence, that the difference $\mu_2 - \mu_1$ between the means
of the parent density functions lies between -0.90452 and 1.23785.
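These limits can be checked directly from the rearranged form of equation (8.50). The sketch below is a hedged illustration: it assumes the pooled standard error $\hat{\sigma}_w (1/N_1 + 1/N_2)^{1/2}$ and the tabulated multiplier 2.0378 quoted earlier for 32 degrees of freedom, and `difference_limits` is an illustrative name. With $w = \bar{x}_1 - \bar{x}_2 = -0.16667$, the limits for $\mu_1 - \mu_2$ come out as about -1.23785 and 0.90452; the figures -0.90452 and 1.23785 quoted in (8.75) are the same interval expressed for $\mu_2 - \mu_1$.

```python
import math

def difference_limits(mean1, var1, n1, mean2, var2, n2, t_mult):
    """Confidence limits for mu1 - mu2 under a common parent variance.

    var1 and var2 are divisor-N sample variances, as used in the text.
    """
    pooled_sd = math.sqrt((n1 * var1 + n2 * var2) / (n1 + n2 - 2))
    std_err = pooled_sd * math.sqrt(1 / n1 + 1 / n2)
    w = mean1 - mean2
    return w - t_mult * std_err, w + t_mult * std_err

# Table 8-1 summary values and the tabulated t value for 32 d.f.
lo, hi = difference_limits(51.0 / 18, 2.264706, 18,
                           48.0 / 16, 2.133334, 16, 2.0378)

print(f"{lo:.5f} < mu1 - mu2 < {hi:.5f}")
print(f"{-hi:.5f} < mu2 - mu1 < {-lo:.5f}")
```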


8.10.3  Confidence  Limits  for  the  Variance 


To compute confidence limits for the parent variance, first
recall equations (8.56) and (8.57) and note that $N s^2/\sigma^2$ is $\chi^2$ distributed
with $(N - 1)$ degrees of freedom. Thus,

$$\frac{N_1 s_1^2}{\chi_2^2} < \sigma^2 < \frac{N_2 s_2^2}{\chi_1^2} \tag{8.76}$$


Choose $\alpha = 0.05$; hence, from the data in Table 8-1, there are $n_1 = N_1 - 1 = 17$
and $n_2 = N_2 - 1 = 15$ degrees of freedom, respectively. Then,

$$\chi^2_{2;\ \mathrm{d.f.}=15;\ \alpha/2=0.025} = 27.488$$

and

$$\chi^2_{1;\ \mathrm{d.f.}=17;\ \alpha/2=0.975} = 7.564$$

These numbers, together with those developed in Table 8-1, yield the following:

$$\frac{(18)(2.264706)}{27.488} < \sigma^2 < \frac{(16)(2.133334)}{7.564}$$

Hence,

$$1.4830 < \sigma^2 < 4.51260 \tag{8.77}$$

Thus,

$$P(1.4830 < \sigma^2 < 4.51260) = 0.95 \tag{8.78}$$


8.10.4  Confidence  Limits  for  the  Standard  Deviation 


For the numerical example cited above, confidence limits
on the standard deviation can be found by taking the square root of all
members of inequality (8.77). Thus,

$$1.21778 < \sigma < 2.124288 , \tag{8.79}$$

and likewise,

$$P(1.21778 < \sigma < 2.124288) = 0.95 \tag{8.80}$$
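The variance and standard deviation limits of Sections 8.10.3 and 8.10.4 follow from a few divisions and square roots. The sketch below simply reproduces the arithmetic as written in the text, which pairs the Sample 1 quantity with the chi-square value for 15 degrees of freedom and the Sample 2 quantity with the value for 17 degrees of freedom; both chi-square values are the tabulated figures quoted in the text, and the variable names are illustrative.

```python
import math

# Tabulated chi-square values quoted in the text.
chi2_upper_15 = 27.488  # 15 d.f., 0.025 in the upper tail
chi2_lower_17 = 7.564   # 17 d.f., 0.025 in the lower tail

# Limits on the parent variance, following inequality (8.76) as written.
var_lo = 18 * 2.264706 / chi2_upper_15
var_hi = 16 * 2.133334 / chi2_lower_17

# Limits on the parent standard deviation: square roots of the above.
sd_lo, sd_hi = math.sqrt(var_lo), math.sqrt(var_hi)

print(f"{var_lo:.4f} < sigma^2 < {var_hi:.5f}")
print(f"{sd_lo:.5f} < sigma < {sd_hi:.6f}")
```

If limits for a single sample are wanted instead, the same two lines apply with both the numerator and the two chi-square values taken for that sample's own size and degrees of freedom.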


8.10.5  Confidence  Limits  for  the  F-Statistic 


As mentioned earlier, the F-statistic, defined by equation
(8.58), is used to test various hypotheses regarding parent population
variances. The hypotheses tested include:

(1) $H_0: \sigma_1^2 = \sigma_2^2$ versus $H_1: \sigma_1^2 \ne \sigma_2^2$. Test with a two-tail test.

(2) $H_0: \sigma_1^2 = \sigma_2^2$ versus $H_1: \sigma_1^2 < \sigma_2^2$. Test with a left-hand tail test only.

(3) $H_0: \sigma_1^2 = \sigma_2^2$ versus $H_1: \sigma_1^2 > \sigma_2^2$. Test with a right-hand tail test only.

The F-test illustrated earlier was an example of (3) above,
namely testing $H_0: \sigma_1^2 = \sigma_2^2$ versus the alternative hypothesis $H_1: \sigma_1^2 > \sigma_2^2$.
At the $\alpha = 0.05$ level of significance, it was found that the null hypothesis
was accepted. Parenthetically, it should be noted that in this example,
$A_{12} = 0$, $B_{12} = F_T(17, 15) = 2.372$, and $F_0 = 1.0538$, so that the upper tail
(right-handed tail) in the F-density function is the only critical region
considered. Thus, since $0 < F_0 < F_T$, or numerically, 0 < 1.0538 < 2.372, then,

$$P(0 < 1.0538 < 2.372) = 0.95 , \tag{8.81}$$

so that for this particular example, it can be stated with 95% confidence that
the two samples shown in Table 8-1 were extracted from parent density functions
having the same variance.


To generalize the above numerical results somewhat, let

$$F_0 = \frac{N_1 s_1^2 / (N_1 - 1)}{N_2 s_2^2 / (N_2 - 1)} = \frac{N_1 s_1^2 / n_1}{N_2 s_2^2 / n_2}$$

(2) $$\int_0^{A_{12}} f(F)\, dF = \alpha ; \qquad P(A_{12} < F_0 < +\infty) = 1 - \alpha$$

(3) $$\int_{B_{12}}^{\infty} f(F)\, dF = \alpha$$

For the numerical example above, data from Table 8-1 were used and case (3)
above was the one under consideration. Here it should be recalled that
$A_{12} = 1/B_{21}$ and $B_{12} = 1/A_{21}$.
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APPENDIX  B 

Future  Efforts  Under  Phases  II,  III  and  IV  of 
AAES  Analyses 



9.0  Proposed  Applications  of  Phase  I Statistical  Analysis 
Methodology  to  Specific  AAES 


Analyses  to  be  applied  to  a specific  AAES  are  summarized  in  Figures  9-1 
and  9-2.  The  first  of  these  proposes  application  of  methodology  developed 
in  Phase  I to  a single  attribute  data  stream.  Here,  an  example  of  a single 
attribute  data  stream  could  be  injury  data  versus  chronological  ejection 
history. 

Use of Figure 9-1 is along the following lines: (1) Select an AAES to
be studied. This selection would be made from the ESCAPAC, Martin-Baker-MK7,
Rockwell International, SIIIS-3 (Stencel), or other systems under evaluation;
(2) Select the single attribute data stream to be studied; (3) Re-format the
data if necessary, or desirable, thereby making it more easily quantified;
(4) Apply Bayesian statistics, if appropriate, to refine a priori probability
estimates; (5) Perform non-parametric (distribution free) tests such as the
run and trend tests. Next, subject the data to hypothesis testing: (1) Apply
the binomial density function to test various dichotomous attributes the data
may have; (2) test randomness of the data by hypothesizing that a random sample
of size n is extracted from a normal parent density function; (3) if
appropriate, test whether the data sample is extracted from a continuous parent
density function which is gamma distributed; and (4) apply the t-density
function to test whether the sample mean is related ("found in a given
neighborhood") to a given parent density function mean. Confidence limits will be
given to appropriate hypothesis tests.
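As an illustration of the non-parametric step, the run test mentioned above can be sketched as a Wald-Wolfowitz runs test about the median. This is one common form of run test and is an assumption here, since the report does not specify which variant is intended; the injury-category sequence is hypothetical and is not MOR data, and `runs_test` is an illustrative name.

```python
import math
from statistics import median

def runs_test(values):
    """Runs test about the median; returns (number of runs, z-score).

    Observations equal to the median are dropped before counting runs.
    """
    mid = median(values)
    signs = [v > mid for v in values if v != mid]
    n_above = sum(signs)
    n_below = len(signs) - n_above
    n = n_above + n_below
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    expected = 2 * n_above * n_below / n + 1
    variance = (2 * n_above * n_below * (2 * n_above * n_below - n)
                / (n ** 2 * (n - 1)))
    return runs, (runs - expected) / math.sqrt(variance)

# Hypothetical injury-category codes in chronological order of ejection.
history = [2, 4, 1, 5, 3, 3, 1, 4, 2, 5, 1, 4, 2, 5, 3, 1]

runs, z = runs_test(history)
print(f"runs = {runs}, z = {z:.3f}")
```

A z-score far from zero in either direction suggests the sequence is not random in order: too few runs points to a trend or clustering, too many to systematic alternation.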

FIGURE 9-2. STATISTICAL ANALYSES OF DUAL ATTRIBUTE DATA STREAMS

In the case of dual attribute data streams, as illustrated by Figure 9-2
for a given AAES, analyses would proceed as follows: (1) Select dual
attribute data streams, for example, injury versus speed, injury versus altitude,
etc.; (2) if desirable, pass each attribute through the single attribute data
stream analysis system described by Figure 9-1; (3) perform regression
analysis on two data streams; (4) subject the dual streams to low order
contingency table data analysis; (5) as appropriate, apply higher order contingency
table data analysis; and (6) compute correlation and contingency coefficients
as needed. The next major investigation would be to test various dichotomous
attributes within and/or between the two data streams. This includes
application of: (1) the binomial parent density function; (2) the Poisson parent
density function; and (3) the bivariate Gaussian parent density function.
For testing the effect of more than two independent outcomes, (4) use the
multinomial parent density function. To test for equality of parent means,
(5) use the t-statistic, and (6) to test for equality of parent variances,
use the F-statistic. Confidence limits, as appropriate, will be computed
for the various tests performed.
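The low order contingency table step and the contingency coefficient can be sketched as follows. This is a minimal illustration, assuming Pearson's chi-square statistic and Pearson's contingency coefficient $C = \sqrt{\chi^2/(\chi^2 + N)}$; the 2 x 2 counts are hypothetical and the function name `chi_square` is illustrative.

```python
import math

def chi_square(table):
    """Pearson chi-square statistic for a two-way table of counts.

    Returns the statistic together with the grand total N.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat, total

# Hypothetical counts: rows = low/high ejection speed,
# columns = minor/major injury.
table = [[10, 5],
         [5, 10]]

chi2, n = chi_square(table)
contingency_c = math.sqrt(chi2 / (chi2 + n))

print(f"chi-square = {chi2:.4f}, C = {contingency_c:.4f}")
```

The statistic is referred to a chi-square table with (rows - 1)(columns - 1) degrees of freedom, here one.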


There are several major areas that will require numerical procedures:
(1) Front end data decoding and analysis (re-formatting); (2) numerical
integration of various parent density functions; (3) application of various
statistical tests; (4) numerical computations involved in correlation analysis;
(5) contingency table data analysis; and (6) hypothesis testing.

Phase  II  analyses,  as  outlined  above,  also  will  be  used  to  screen  the 
analyses  proposed,  to  test  their  applicability  and  possibly  identify  other 
analyses  that  are  appropriate. 
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10.0  Outline  of  Phase  III  Activities 

Proposed activities in Phase III, together with background information,
are shown schematically in Figure 10-1.


FIGURE  10-1.  PROPOSED  APPLICATION  OF  THE  DERIVED  STATISTICAL 
ANALYSIS  METHODOLOGY  TO  MORE  THAN  ONE  AAES. 

The analyses developed and results obtained in Phase III will be used
to assist designated laboratories, systems designers/manufacturers and other
selected activities to understand the approach, the results of the
investigation, and the significance of the results so obtained. Interpretations will be
thoroughly presented. This is deemed appropriate in that certain corrective
actions may need to be developed so that existing and/or future AAES's can
be improved.


11.0  Sketch  of  Phase  IV  Activities 


Following  successful  completion  of  Phases  I,  II,  and  III,  the  fourth 
phase  of  this  project  will  be  structured  along  the  following  lines: 

• Program  Planning  and  Development 

Here, plans will be developed and executed on an annual basis.
This will be a continuing review. Inputs provided for this review
will include the following: (1) data for the reported year, and
(2) data for the reported year and the four preceding years. Such
procedures will assist a comparative evaluation of the effects of
introducing AAES improvements on existing and new AAES's.

• Awareness  of  Specification  Modification 

In  the  execution  of  Phases  I through  IV  maintain  cognizance  of  the 
constant  need  for  technical  specification  updating.  Special  effort 
will  be  exerted  in  the  area  of  data  and  test  requirements.  This 
task  will  be  done  concurrently  with  the  specific  tasks  outlined 
above. 

• Problem  Isolation 

At all times special problems that are discovered during the
accomplishment of the above tasks should be brought to the attention
of NAVAIR-531.
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